[ https://issues.apache.org/jira/browse/SPARK-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260343#comment-14260343 ]
Apache Spark commented on SPARK-4968:
-------------------------------------

User 'saucam' has created a pull request for this issue:
https://github.com/apache/spark/pull/3830

> [SparkSQL] java.lang.UnsupportedOperationException when hive partition
> doesn't exist and order by and limit are used
> --------------------------------------------------------------------------------------------------------------------
>
>                  Key: SPARK-4968
>                  URL: https://issues.apache.org/jira/browse/SPARK-4968
>              Project: Spark
>           Issue Type: Bug
>           Components: SQL
>     Affects Versions: 1.1.1
>          Environment: Spark 1.1.1
>                       scala - 2.10.2
>                       hive metastore db - pgsql
>                       OS - Linux
>             Reporter: Shekhar Bansal
>              Fix For: 1.1.1, 1.1.2, 1.2.1
>
> Create a table with partitions, then run a query that selects a partition
> which doesn't exist and contains ORDER BY and LIMIT.
> I am running queries in hiveContext.
>
> 1. Create hive table
>
>     create table if not exists testTable (ID1 BIGINT, ID2 BIGINT, Start_Time STRING, End_Time STRING)
>     PARTITIONED BY (Region STRING, Market STRING)
>     ROW FORMAT DELIMITED
>     FIELDS TERMINATED BY ','
>     LINES TERMINATED BY '\n'
>     STORED AS TEXTFILE;
>
> 2. Create data
>
>     1,2,"2014-11-01","2014-11-02"
>     2,3,"2014-11-01","2014-11-02"
>     3,4,"2014-11-01","2014-11-02"
>
> 3. Load data into hive
>
>     LOAD DATA LOCAL INPATH '/tmp/input.txt' OVERWRITE INTO TABLE testTable
>     PARTITION (Region="North", market='market1');
>
> 4. Run query
>
>     SELECT * FROM testTable WHERE market = 'market2' ORDER BY End_Time DESC LIMIT 100;
>
> Error trace:
>
>     java.lang.UnsupportedOperationException: empty collection
>         at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
>         at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.reduce(RDD.scala:863)
>         at org.apache.spark.rdd.RDD.takeOrdered(RDD.scala:1136)
>         at org.apache.spark.sql.execution.TakeOrdered.executeCollect(basicOperators.scala:171)
>         at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
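The trace shows `RDD.takeOrdered` (the physical operator behind ORDER BY + LIMIT) calling `RDD.reduce` to merge per-partition results; `reduce` throws `UnsupportedOperationException` on an empty collection, which is exactly what a scan of a non-existent partition yields. The same failure mode reproduces in plain Scala without Spark. The `safeTopK` helper below is a hypothetical sketch of a guarded merge for illustration only, not the actual patch from the linked pull request:

```scala
object EmptyReduceDemo {
  // Mirrors the failing call: Scala collections, like RDD.reduce, throw
  // UnsupportedOperationException when reduce is applied to an empty collection.
  def reduceThrows(rows: Seq[Int]): Boolean =
    try { rows.reduce(math.max); false }
    catch { case _: UnsupportedOperationException => true }

  // Hypothetical guarded variant: return an empty result instead of merging
  // when the scan produced no rows (sorted descending, take the first k).
  def safeTopK(rows: Seq[Int], k: Int): Seq[Int] =
    if (rows.isEmpty) Seq.empty
    else rows.sorted(Ordering[Int].reverse).take(k)

  def main(args: Array[String]): Unit = {
    println(reduceThrows(Seq.empty))  // same exception class as the error trace
    println(safeTopK(Seq.empty, 100)) // empty result, no exception
  }
}
```

Any fix along these lines only changes behavior for the empty case; a query hitting an existing partition still returns the ordered, limited rows.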