How to set a config for a single query?

2023-01-03 Thread Felipe Pessoto
Hi, in Scala, is it possible to set a config value for a single query? I could set/unset the value, but that won't work in multithreading scenarios. Example: set spark.sql.adaptive.coalescePartitions.enabled = false, then run queryA_df.collect
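
A minimal Scala sketch of both the set/unset pattern the poster describes and one possible workaround. Everything here is illustrative: queryA_df stands in for the poster's real query, and the isolated-session idea assumes the query can be rebuilt on a fresh session.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("per-query-conf").getOrCreate()
    val queryA_df = spark.range(1000).toDF("id")  // stand-in for the real query

    // Set/unset around one query: works single-threaded, but the setting
    // lives on the shared session, so concurrent queries on other threads
    // will see it between the set and the unset.
    spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "false")
    val resultA = queryA_df.collect()
    spark.conf.unset("spark.sql.adaptive.coalescePartitions.enabled")

    // One workaround: newSession() shares the SparkContext but keeps an
    // isolated SQLConf, so the override cannot leak into other threads.
    val isolated = spark.newSession()
    isolated.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "false")
    val resultB = isolated.range(1000).toDF("id").collect()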

[SparkR] Compare datetime with Sys.time() throws error in R (>= 4.2.0)

2023-01-03 Thread Vivek Atal
Hi, base R 4.2.0 introduced a change ([Rd] R 4.2.0 is released): "Calling if() or while() with a condition of length greater than one gives an error rather than a warning." The code below is a reproducible example of the issue: if it is executed in R >= 4.2.0 it generates an error, else

Re: Incorrect csv parsing when delimiter used within the data

2023-01-03 Thread Sean Owen
Why does the data even need cleaning? That's all perfectly correct. The error was setting quote to be an escape char.

On Tue, Jan 3, 2023, 2:32 PM Mich Talebzadeh wrote:
> if you take your source CSV as below
>
> "a","b","c"
> "1","",","
> "2","","abc"
>
> and define your code as below

Re: Incorrect csv parsing when delimiter used within the data

2023-01-03 Thread Mich Talebzadeh
if you take your source CSV as below

"a","b","c"
"1","",","
"2","","abc"

and define your code as below

csv_file = "hdfs://rhes75:9000/data/stg/test/testcsv.csv"
# read the csv file into a Spark DataFrame
listing_df = spark.read.format("com.databricks.spark.csv").option("inferSchema",
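
For reference, here is what the truncated read might look like written out, as a Scala sketch for spark-shell (where spark is predefined). The header/inferSchema continuation is an assumption about how the cut-off line ends; note also that com.databricks.spark.csv is the legacy Spark 1.x package name, and the builtin csv format has been its replacement since Spark 2.0.

    val csvFile = "hdfs://rhes75:9000/data/stg/test/testcsv.csv"
    // builtin source; equivalent to the old com.databricks.spark.csv package
    val listingDf = spark.read
      .format("csv")
      .option("inferSchema", "true")  // assumed continuation of the truncated option
      .option("header", "true")
      .load(csvFile)
    listingDf.show(truncate = false)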

Re: Incorrect csv parsing when delimiter used within the data

2023-01-03 Thread Sean Owen
No, you've set the escape character to double-quote, when it looks like you mean for it to be the quote character (which it already is). Remove this setting, as it's incorrect.

On Tue, Jan 3, 2023 at 11:00 AM Saurabh Gulati wrote:
> Hello,
> We are seeing a case with csv data when it parses csv
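
Concretely, Sean's fix is to drop the escape option and let the default quote character do the work. A Scala sketch for spark-shell (the path is hypothetical; with default options the empty b fields typically read back as null, since nullValue defaults to the empty string):

    // quote already defaults to '"'; no escape option is set
    val df = spark.read
      .option("multiLine", "true")
      .option("header", "true")
      .csv("/path/to/testcsv.csv")  // hypothetical path
    df.show()
    // the embedded comma in column c is preserved because it sits
    // inside the quotes; no cleaning of the source data is needed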

Incorrect csv parsing when delimiter used within the data

2023-01-03 Thread Saurabh Gulati
Hello, we are seeing a case where Spark parses csv data incorrectly. The issue can be replicated using the below csv data

"a","b","c"
"1","",","
"2","","abc"

and using the spark csv read command:

df = spark.read.format("csv")\
    .option("multiLine", True)\
    .option("escape", '"')\
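
To reproduce the report end to end, a Scala sketch for spark-shell; the temp-file scaffolding is added here for self-containment, and the options mirror the PySpark snippet above. The corrected read, without the escape option, is in Sean Owen's reply earlier in this thread.

    import java.nio.file.Files

    // scaffolding: write the sample rows from the thread to a temp file
    val rows = Seq("\"a\",\"b\",\"c\"", "\"1\",\"\",\",\"", "\"2\",\"\",\"abc\"")
    val path = Files.createTempFile("testcsv", ".csv")
    Files.write(path, rows.mkString("\n").getBytes)

    // the problematic read: escape set to the quote character
    val df = spark.read
      .option("multiLine", "true")
      .option("escape", "\"")
      .option("header", "true")
      .csv(path.toString)
    df.show()  // per the thread, the row holding the quoted comma is misparsed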