IMarvinTPA commented on issue #55882:
URL: https://github.com/apache/spark/issues/55882#issuecomment-4487916638
> Could you reproduce the problem on Linux/Mac? I don't think spark in
general cares too much about Windows. We don't have any Windows related test
and most of our users do not use it on Windows. I tried this on my macbook and
it works fine.
My work machine runs Windows, so that's where I'm running into the problem.
I have a library that lets me run the same code in Databricks and against SQL
Server or Postgres on a local machine with minimal changes to the working
script. It writes the correct SQL translations and has functions that
convert between pandas, the pandas API on Spark, and Spark. UDFs are important
for applying custom transformations to the data.
> The code is a bit weird though - `cols2 = list(map(list, zip(*cols)))`
what are you trying to achieve here?
In my main code, `cols` is a list of lists where each nested list holds all
of the values for one column. Spark wants each element of the outer list to
be a row containing a named value for every column. So this
just pivots the data from `[[val_for_col1_row1, val_for_col1_row2],
[val_for_col2_row1, val_for_col2_row2]]` into `[{"col1" : val_for_col1_row1,
"col2": val_for_col2_row1}, {"col1" : val_for_col1_row2, "col2":
val_for_col2_row2}]`.
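
For anyone following along, here is a minimal plain-Python sketch of that pivot. The sample values and the `names` list are made up for illustration and are not from my real code. Note that the `zip(*cols)` line on its own yields a list of row lists; pairing each row with column names is shown as a separate step.

```python
# Column-major input: one inner list per column (placeholder values).
cols = [
    [1, 2, 3],        # every value for col1
    ["a", "b", "c"],  # every value for col2
]
names = ["col1", "col2"]  # hypothetical column names

# zip(*cols) transposes columns into rows; map(list, ...) turns each
# resulting tuple into a list.
rows = list(map(list, zip(*cols)))

# Pair each row with the column names to get one dict per row.
records = [dict(zip(names, row)) for row in rows]

print(rows)
print(records)
```

One thing to watch: `zip` stops at the shortest input, so if the columns have different lengths the extra values are dropped silently.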
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]