[GitHub] [incubator-iceberg] rdblue closed issue #765: [Incremental Scan] Follow ups for #315

2020-02-10 Thread GitBox
rdblue closed issue #765: [Incremental Scan] Follow ups for #315 URL: https://github.com/apache/incubator-iceberg/issues/765 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [incubator-iceberg] rdblue merged pull request #781: Incremental scan followups

2020-02-10 Thread GitBox
rdblue merged pull request #781: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/781

[GitHub] [incubator-iceberg] rdblue merged pull request #783: Incremental scan followups

2020-02-10 Thread GitBox
rdblue merged pull request #783: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/783

[GitHub] [incubator-iceberg] chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377412697 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -329,21

[GitHub] [incubator-iceberg] chenjunjiedada commented on issue #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
chenjunjiedada commented on issue #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#issuecomment-584451751 The Python build failed. @aokolnychyi, could you please help trigger CI?

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #782: Incremental scan followups

2020-02-10 Thread GitBox
rdsr commented on a change in pull request #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782#discussion_r377408828 ## File path: core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java ## @@ -137,4 +136,29 @@ protected

[GitHub] [incubator-iceberg] chenjunjiedada commented on issue #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
chenjunjiedada commented on issue #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#issuecomment-584445039 @aokolnychyi, thanks for the review, just updated.

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377404989 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -492,10

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377403403 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -329,21

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377398380 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -174,22

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377402464 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -200,50

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #782: Incremental scan followups

2020-02-10 Thread GitBox
rdblue commented on a change in pull request #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782#discussion_r377409714 ## File path: core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java ## @@ -137,4 +136,29 @@ protected

[GitHub] [incubator-iceberg] rdsr commented on issue #782: Incremental scan followups

2020-02-10 Thread GitBox
rdsr commented on issue #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782#issuecomment-584463615 Python build failed. Seems like the failure is unrelated. ``` ERROR: invocation failed (exit code 1), logfile:

[GitHub] [incubator-iceberg] openinx commented on issue #788: Integrate the Apache Flink into Apache Iceberg

2020-02-10 Thread GitBox
openinx commented on issue #788: Integrate the Apache Flink into Apache Iceberg URL: https://github.com/apache/incubator-iceberg/issues/788#issuecomment-584503640 @aokolnychyi What do you think about the document? I mean, would Flink's necessary incremental consumption and low

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #782: Incremental scan followups

2020-02-10 Thread GitBox
rdsr commented on a change in pull request #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782#discussion_r377410460 ## File path: core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java ## @@ -137,4 +136,29 @@ protected

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-10 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377401371 ## File path:

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-10 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377400868 ## File path:

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-10 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377400690 ## File path:

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-10 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377401096 ## File path:

[GitHub] [incubator-iceberg] rdblue merged pull request #785: Overload TableMetadataParser read() method

2020-02-10 Thread GitBox
rdblue merged pull request #785: Overload TableMetadataParser read() method URL: https://github.com/apache/incubator-iceberg/pull/785

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
aokolnychyi commented on a change in pull request #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#discussion_r377407115 ## File path: core/src/main/java/org/apache/iceberg/SnapshotProducer.java ## @@ -77,7 +77,7 @@

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #782: Incremental scan followups

2020-02-10 Thread GitBox
rdblue commented on a change in pull request #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782#discussion_r377404270 ## File path: core/src/main/java/org/apache/iceberg/IncrementalDataTableScan.java ## @@ -137,4 +136,29 @@ protected

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584436464 @rdblue, could you take a look?

[GitHub] [incubator-iceberg] aokolnychyi opened a new pull request #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
aokolnychyi opened a new pull request #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789 This PR fixes a race condition in `SnapshotProducer` while generating a new snapshot id. As it turns out, we can use different snapshot ids

[GitHub] [incubator-iceberg] rdblue commented on issue #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
rdblue commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584440148 I'm not sure I understand the cases in which `snapshotId()` is called concurrently. Shouldn't that be set lazily just once for any given

[GitHub] [incubator-iceberg] chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377411005 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -174,22

[GitHub] [incubator-iceberg] chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r377413188 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -200,50

[GitHub] [incubator-iceberg] rdblue commented on issue #785: Overload TableMetadataParser read() method

2020-02-10 Thread GitBox
rdblue commented on issue #785: Overload TableMetadataParser read() method URL: https://github.com/apache/incubator-iceberg/pull/785#issuecomment-584424628 Looks great. Thanks @vrozov!

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #786: replace SparkDataFile with DataFile

2020-02-10 Thread GitBox
aokolnychyi commented on issue #786: replace SparkDataFile with DataFile URL: https://github.com/apache/incubator-iceberg/pull/786#issuecomment-584435160 @chenjunjiedada, great work! I took a quick look and had only minor comments.

[GitHub] [incubator-iceberg] jun-he commented on issue #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
jun-he commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584507664 The race can happen when two threads call `snapshotId` at the same time while the snapshot is not yet initialized. Then both pass the null
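
The race described in this comment is a check-then-set on a lazily initialized value. The following is a minimal sketch of that pattern and one common way to close it, not the change made in the pull request. `LazyIdHolder` and `idGenerator` are hypothetical names introduced only for illustration; they are not part of Iceberg.

```java
import java.util.function.LongSupplier;

// Hypothetical holder for illustration; this is not Iceberg's SnapshotProducer.
final class LazyIdHolder {
  private final LongSupplier idGenerator; // assumed source of fresh ids
  private Long id = null;                 // guarded by the monitor on "this"

  LazyIdHolder(LongSupplier idGenerator) {
    this.idGenerator = idGenerator;
  }

  // Without "synchronized", two threads could both observe id == null,
  // each call the generator, and each return a different id.
  // Holding the monitor makes the null check and the assignment one atomic step.
  synchronized long snapshotId() {
    if (id == null) {
      id = idGenerator.getAsLong();
    }
    return id;
  }
}
```

With this shape the generator runs at most once per holder, so every caller sees the same id. Whether that matches the intended semantics of `snapshotId()` in Iceberg is a question for the thread itself.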

[GitHub] [incubator-iceberg] jun-he edited a comment on issue #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
jun-he edited a comment on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584507664 The race can happen when two threads call `snapshotId` at the same time while the snapshot is not yet initialized. Then both pass

[GitHub] [incubator-iceberg] jun-he edited a comment on issue #789: Fix race condition in SnapshotProducer

2020-02-10 Thread GitBox
jun-he edited a comment on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584507664 The race can happen when two threads call `snapshotId` at the same time while the snapshot is not yet initialized. Then both pass

[GitHub] [incubator-iceberg] jun-he opened a new pull request #790: Improve getWriter in BaseRewriteManifests

2020-02-10 Thread GitBox
jun-he opened a new pull request #790: Improve getWriter in BaseRewriteManifests URL: https://github.com/apache/incubator-iceberg/pull/790 Here, the `writers` object is not volatile and the double-checked locking might not be thread safe. Instead, use concurrent map's atomic
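
As a rough illustration of the idea in this description, the sketch below replaces a hand-written double-checked lock with an atomic map operation. It assumes the operation meant is `ConcurrentHashMap.computeIfAbsent`, which the truncated preview does not confirm. `WriterRegistry` and `factory` are hypothetical names, and the code is not taken from `BaseRewriteManifests`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry for illustration; not the class changed in the pull request.
final class WriterRegistry<K, W> {
  // ConcurrentHashMap provides its own safe publication, so no volatile field
  // or explicit lock is needed around lookups.
  private final Map<K, W> writers = new ConcurrentHashMap<>();
  private final Function<K, W> factory; // assumed way to create a writer for a key

  WriterRegistry(Function<K, W> factory) {
    this.factory = factory;
  }

  // computeIfAbsent runs the factory at most once per absent key and returns
  // the same instance to every caller asking for that key.
  W getWriter(K key) {
    return writers.computeIfAbsent(key, factory);
  }
}
```

One caveat of this approach: the factory passed to `computeIfAbsent` should be short and must not modify the same map, since other updates to the map may block while it runs.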

[GitHub] [incubator-iceberg] openinx opened a new issue #788: Integrate the Apache Flink into Apache Iceberg

2020-02-10 Thread GitBox
openinx opened a new issue #788: Integrate the Apache Flink into Apache Iceberg URL: https://github.com/apache/incubator-iceberg/issues/788 @rdblue, we've had some discussion about integrating Apache Flink into Apache Iceberg. Thanks for your helpful information. Here I have written the