lordk911 opened a new issue #1463:
URL: https://github.com/apache/iceberg/issues/1463


   I'm testing with Spark 3.0.1, CDH 5.14, and Iceberg 0.9.1, using spark-shell.
   The catalog config is:
   ```
   spark.sql.catalog.hadoop_prod               org.apache.iceberg.spark.SparkCatalog
   spark.sql.catalog.hadoop_prod.type          hadoop
   spark.sql.catalog.hadoop_prod.warehouse     hdfs://hdfsnamespace/user/hive/warehouse
   ```
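   For completeness, this is how I pass those settings when launching spark-shell (a sketch; the jar name is from my setup and may differ):
   ```
   spark-shell \
     --jars iceberg-spark3-runtime-0.9.1.jar \
     --conf spark.sql.catalog.hadoop_prod=org.apache.iceberg.spark.SparkCatalog \
     --conf spark.sql.catalog.hadoop_prod.type=hadoop \
     --conf spark.sql.catalog.hadoop_prod.warehouse=hdfs://hdfsnamespace/user/hive/warehouse
   ```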
   
   I tried to create a table:
   `scala> spark.sql("CREATE TABLE hadoop_prod.ice.icetest (id bigint, data string) USING iceberg PARTITIONED BY (id)")`
   
   Insert some values:
   ```
   scala> spark.sql("INSERT INTO hadoop_prod.ice.icetest VALUES (1, 'a'), (2, 'b'), (3, 'c')")
   scala> spark.sql("select * from hadoop_prod.ice.icetest").show(false)
   +---+----+
   |id |data|
   +---+----+
   |1  |a   |
   |2  |b   |
   |3  |c   |
   +---+----+
   ```
   Delete a partition:
   ```
   scala> spark.sql("delete from hadoop_prod.ice.icetest where id=1").show(false)
   scala> spark.sql("select * from hadoop_prod.ice.icetest").show(false)
   +---+----+
   |id |data|
   +---+----+
   |2  |b   |
   |3  |c   |
   +---+----+
   
   ```
   
   Insert some more values:
   ```
   scala> spark.sql("INSERT INTO hadoop_prod.ice.icetest VALUES (1, 'a'), (1, 'b'), (1, 'c')")
   scala> spark.sql("select * from hadoop_prod.ice.icetest").show(false)
   +---+----+
   |id |data|
   +---+----+
   |2  |b   |
   |3  |c   |
   |1  |a   |
   |1  |b   |
   |1  |c   |
   +---+----+
   
   ```
   
   Show snapshots:
   ```
   scala> spark.sql("select committed_at, snapshot_id, parent_id, operation from hadoop_prod.ice.icetest.snapshots").show(false)
   +-----------------------+-------------------+------------------+---------+
   |committed_at           |snapshot_id        |parent_id         |operation|
   +-----------------------+-------------------+------------------+---------+
   |2020-09-16 13:32:39.952|628886310322778010 |null              |append   |
   |2020-09-16 13:42:34.109|598127609483871079 |628886310322778010|delete   |
   |2020-09-16 13:43:14.415|6880502734717374864|598127609483871079|append   |
   +-----------------------+-------------------+------------------+---------+
   ```
   
   But whichever snapshot I read, it shows the same, latest state of the table:
   ```
   scala> val df2 = spark.read.option("snapshot-id", 628886310322778010L).table("hadoop_prod.ice.icetest")
   df2: org.apache.spark.sql.DataFrame = [id: bigint, data: string]
   
   scala> df2.show
   +---+----+
   | id|data|
   +---+----+
   |  2|   b|
   |  3|   c|
   |  1|   a|
   |  1|   b|
   |  1|   c|
   +---+----+
   
   ```
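   If it helps with triage: as far as I can tell, in Spark 3.0 the options set on `DataFrameReader` are not forwarded to `table()` (that forwarding only landed in Spark 3.1, SPARK-32592 if I understand correctly), so the `snapshot-id` option above may be silently dropped. A path-based read sketch that might work around this (assuming the Hadoop-catalog table sits under the configured warehouse path; untested):
   ```
   scala> val df3 = spark.read.format("iceberg").option("snapshot-id", 628886310322778010L).load("hdfs://hdfsnamespace/user/hive/warehouse/ice/icetest")
   scala> df3.show(false)
   ```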
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


