Hello. I am a developer who is learning Spark programming. I am asking for help because I am having difficulty solving the following problem on my own.
My development environment is as follows.
-------------------------------------------------------------------------------------------------------
OS           : Windows 11
Docker       : Docker Desktop 4.9.0
Docker Image : bitnami/spark (https://hub.docker.com/r/bitnami/spark)
Language     : Scala
-------------------------------------------------------------------------------------------------------

The execution result is as follows.
-------------------------------------------------------------------------------------------------------
scala> val test = spark.read.json("lz4_block.lz4")
test: org.apache.spark.sql.DataFrame = []

scala> test.first()
java.lang.UnsupportedOperationException: empty collection
  at org.apache.spark.rdd.RDD.$anonfun$first$1(RDD.scala:1465)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
  at org.apache.spark.rdd.RDD.first(RDD.scala:1463)
  ... 47 elided
-------------------------------------------------------------------------------------------------------

Every compressed file in the attached archive gives the same result; only the uncompressed file is read correctly. How do I get the right result? Thanks for the help.
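In case it helps with diagnosis, here is a small stdlib-only check I can run outside Spark to see which LZ4 variant a file uses (the file name is taken from my test data; this is just a sketch, checking only for the LZ4 frame-format magic number, so it will not distinguish every possible variant):

```scala
import java.nio.file.{Files, Paths}

object CheckLz4Magic {
  def main(args: Array[String]): Unit = {
    // Files in the LZ4 frame format (what the `lz4` CLI produces) start with
    // the magic bytes 0x04 0x22 0x4D 0x18 (little-endian 0x184D2204).
    // Hadoop's Lz4Codec uses a different block layout without this header,
    // so the check below tells me which kind of file I actually have.
    val path = if (args.nonEmpty) args(0) else "lz4_block.lz4"
    val header = Files.readAllBytes(Paths.get(path)).take(4)
    val frameMagic = Array(0x04, 0x22, 0x4D, 0x18).map(_.toByte)
    val isFrame = header.sameElements(frameMagic)
    println(s"$path starts with LZ4 frame magic: $isFrame")
  }
}
```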
<<attachment: data.zip>>
--------------------------------------------------------------------- To unsubscribe e-mail: user-unsubscr...@spark.apache.org