[ https://issues.apache.org/jira/browse/HADOOP-18856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-18856:
------------------------------------
    Summary: Spark insertInto with location GCS bucket root not supported  (was: Spark insertInto with location GCS bucket root causes NPE)

> Spark insertInto with location GCS bucket root not supported
> ------------------------------------------------------------
>
>                 Key: HADOOP-18856
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18856
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>    Affects Versions: 3.3.3
>            Reporter: Dipayan Dev
>            Priority: Minor
>
>
> {noformat}
> scala> import org.apache.hadoop.fs.Path
> import org.apache.hadoop.fs.Path
> scala> val path: Path = new Path("gs://test_dd123/")
> path: org.apache.hadoop.fs.Path = gs://test_dd123/
> scala> path.suffix("/num=123")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:150)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:129)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:450){noformat}
>
> Path.suffix throws a NullPointerException when it is called on the root of a
> GCS bucket.
>
> In our organisation, we use the GCS bucket root as the location of our Hive
> table. Dataproc's latest version, 2.1, uses *Hadoop* *3.3.3*, so this needs
> to be fixed in 3.3.3.
> Spark Scala code to reproduce this issue:
> {noformat}
> val DF = Seq(("test1", 123)).toDF("name", "num")
> DF.write.option("path", "gs://test_dd123/").mode(SaveMode.Overwrite).partitionBy("num").format("orc").saveAsTable("schema_name.table_name")
> val DF1 = Seq(("test2", 125)).toDF("name", "num")
> DF1.write.mode(SaveMode.Overwrite).format("orc").insertInto("schema_name.table_name")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:141)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:120)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:441)
>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:254)
> {noformat}
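
The stack traces above are consistent with Path.suffix building its result from the path's parent: a bucket root has no parent, so getParent returns null and the Path constructor dereferences it. The sketch below, meant for spark-shell with hadoop-common on the classpath, shows that condition and one way to build a child path without suffix. The helper childOf is hypothetical (it is not part of Hadoop or Spark) and covers only the "/child" use of suffix, not appending text to the last path segment.

```scala
import org.apache.hadoop.fs.Path

// Hypothetical helper, not a Hadoop or Spark API: resolve a child segment
// against a base path using the two-argument Path constructor, which does
// not need the base path's parent.
def childOf(base: Path, child: String): Path =
  new Path(base, child.stripPrefix("/"))

val root = new Path("gs://test_dd123/")

// A bucket root has no parent, which is what Path.suffix trips over.
println(s"isRoot = ${root.isRoot}, parent = ${root.getParent}")

// Building the partition directory this way works for a root or a non-root base.
val partition = childOf(root, "/num=123")
println(partition)
```

This does not change what Spark does inside InsertIntoHadoopFsRelationCommand, which calls Path.suffix itself. Given the updated summary, pointing the table at a subdirectory of the bucket rather than at its root is probably the practical route.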