[GitHub] [incubator-iceberg] jun-he commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
jun-he commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-585077940 Another way to avoid double checked locking and test changes is to use lazy init. For example, declare `private final Supplier
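
The snippet above is cut off, so the exact declaration jun-he proposed is not visible here. As a rough illustration of the general lazy-init idea only, the following sketch wraps the id generator in a memoizing `Supplier`. `LazyInitSketch`, `memoize`, `Producer`, and `generations` are hypothetical names for this example and are not taken from the Iceberg code base. The wrapper uses a plain synchronized `get()` for simplicity.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class LazyInitSketch {
  /** Hypothetical helper: wraps a supplier so the delegate runs at most once. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private boolean initialized = false;
      private T value;

      @Override
      public synchronized T get() {
        // The monitor on this wrapper guards both the flag and the cached value.
        if (!initialized) {
          value = delegate.get();
          initialized = true;
        }
        return value;
      }
    };
  }

  /** Hypothetical stand-in for a producer that needs one stable id. */
  static final class Producer {
    // Counts how often the generator actually ran (for demonstration only).
    final AtomicInteger generations = new AtomicInteger();

    private final Supplier<Long> snapshotId = memoize(() -> {
      generations.incrementAndGet();
      return System.nanoTime(); // placeholder for a real id generator
    });

    long snapshotId() {
      return snapshotId.get();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Producer producer = new Producer();
    Thread[] threads = new Thread[4];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(producer::snapshotId);
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("generator ran " + producer.generations.get() + " time(s)");
  }
}
```

Because the field is `final` and assigned at construction, callers never see a half-initialized supplier, and the id is still computed only on first use.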

[GitHub] [incubator-iceberg] TGooch44 opened a new pull request #793: Updating travis to use bionic

2020-02-11 Thread GitBox
TGooch44 opened a new pull request #793: Updating travis to use bionic URL: https://github.com/apache/incubator-iceberg/pull/793 Updating travis to use bionic instead of trusty. @chyzzqo2 do you think I should be updating this to use python 3.7 too? cc: @rdblue @danielcweeks

[GitHub] [incubator-iceberg] rdsr commented on issue #771: Can't register table to Hive

2020-02-11 Thread GitBox
rdsr commented on issue #771: Can't register table to Hive URL: https://github.com/apache/incubator-iceberg/issues/771#issuecomment-585049569 @CarreauClement for the line > val catalog = new HiveCatalog(spark.sparkContext.hadoopConfiguration) I'd double check if HiveCatalog is

[GitHub] [incubator-iceberg] jun-he edited a comment on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
jun-he edited a comment on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-585077940 Another way to avoid double checked locking and test changes is to use lazy init. For example, - declare `private final

[GitHub] [incubator-iceberg] rdblue commented on issue #745: schema evolution support

2020-02-11 Thread GitBox
rdblue commented on issue #745: schema evolution support URL: https://github.com/apache/incubator-iceberg/pull/745#issuecomment-584954078 @ravichinoy, thanks for working on this. It looks good. Can you either fix the methods that use a boxed Boolean or implement the builder-like API

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #745: schema evolution support

2020-02-11 Thread GitBox
rdblue commented on a change in pull request #745: schema evolution support URL: https://github.com/apache/incubator-iceberg/pull/745#discussion_r377986573 ## File path: spark/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java ## @@ -189,12 +189,13 @@

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #745: schema evolution support

2020-02-11 Thread GitBox
rdblue commented on a change in pull request #745: schema evolution support URL: https://github.com/apache/incubator-iceberg/pull/745#discussion_r377985010 ## File path: api/src/main/java/org/apache/iceberg/types/CheckCompatibility.java ## @@ -39,7 +39,36 @@ * @return

[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r378004624 ## File path:

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues

2020-02-11 Thread GitBox
rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues URL: https://github.com/apache/incubator-iceberg/pull/778#discussion_r378035121 ## File path: data/src/main/java/org/apache/iceberg/data/orc/GenericOrcReader.java

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues

2020-02-11 Thread GitBox
rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues URL: https://github.com/apache/incubator-iceberg/pull/778#discussion_r378035836 ## File path: data/src/main/java/org/apache/iceberg/data/orc/GenericOrcReader.java

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues

2020-02-11 Thread GitBox
rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues URL: https://github.com/apache/incubator-iceberg/pull/778#discussion_r378018524 ## File path: data/src/main/java/org/apache/iceberg/data/orc/GenericOrcWriter.java

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues

2020-02-11 Thread GitBox
rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues URL: https://github.com/apache/incubator-iceberg/pull/778#discussion_r378039841 ## File path: data/src/main/java/org/apache/iceberg/data/orc/GenericOrcWriter.java

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues

2020-02-11 Thread GitBox
rdsr commented on a change in pull request #778: ORC: Implement TestGenericData and fix reader and writer issues URL: https://github.com/apache/incubator-iceberg/pull/778#discussion_r378039888 ## File path: data/src/main/java/org/apache/iceberg/data/orc/GenericOrcWriter.java

[GitHub] [incubator-iceberg] chyzzqo2 commented on issue #793: Updating travis to use bionic

2020-02-11 Thread GitBox
chyzzqo2 commented on issue #793: Updating travis to use bionic URL: https://github.com/apache/incubator-iceberg/pull/793#issuecomment-584975942 Python 3.6 is still pretty widely used, so I think it's fine to leave it there as the baseline. Unless of course you actually want / need 3.7 only

[GitHub] [incubator-iceberg] rdsr commented on issue #791: java.lang.RuntimeException: Metastore operation failed error reason

2020-02-11 Thread GitBox
rdsr commented on issue #791: java.lang.RuntimeException: Metastore operation failed error reason URL: https://github.com/apache/incubator-iceberg/issues/791#issuecomment-585044154 @figo10203, there could be an exception in your Metastore logs. We should also check that for clues.

[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377980728 ## File path:

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377983822 ## File path:

[GitHub] [incubator-iceberg] rdsr commented on issue #792: AvroSchemaUtil.toIceberg does not process logicalType during schema conversion

2020-02-11 Thread GitBox
rdsr commented on issue #792: AvroSchemaUtil.toIceberg does not process logicalType during schema conversion URL: https://github.com/apache/incubator-iceberg/issues/792#issuecomment-585038723 I tried out the above example. I think it is not working because there's a typo. The

[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377660527 ## File path:

[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377654499 ## File path:

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
aokolnychyi commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377731665 ## File path:

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584711200 @jun-he is correct. As an example, consider `filterManifests` in `MergingSnapshotProducer`. It uses a thread pool to process each

[GitHub] [incubator-iceberg] aokolnychyi edited a comment on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
aokolnychyi edited a comment on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584711200 @jun-he is correct. As an example, consider `filterManifests` in `MergingSnapshotProducer`. It uses a thread pool to process

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
aokolnychyi commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377731665 ## File path:

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584729548 Actually, there is one consequence of moving the initialization of the snapshot id in the constructor that we should be careful

[GitHub] [incubator-iceberg] aokolnychyi edited a comment on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
aokolnychyi edited a comment on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584729548 Actually, there is one consequence of moving the initialization of the snapshot id in the constructor that we should be

[GitHub] [incubator-iceberg] rdblue commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
rdblue commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584808267 I'm okay with either solution. Seems easier to do the double-checked lock to avoid test changes, though.
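
For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of double-checked locking on a lazily assigned id. It is not the code merged in #789; `DoubleCheckedSketch`, `Producer`, `generateId`, and `generations` are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class DoubleCheckedSketch {
  /** Hypothetical stand-in for a producer that assigns its id on first use. */
  static final class Producer {
    private final Object lock = new Object();
    // volatile is required so a thread that skips the lock sees a fully published value.
    private volatile Long snapshotId = null;
    // Counts how often the generator actually ran (for demonstration only).
    final AtomicInteger generations = new AtomicInteger();

    private long generateId() {
      generations.incrementAndGet();
      return System.nanoTime(); // placeholder for a real id generator
    }

    long snapshotId() {
      Long result = snapshotId; // first check, without locking
      if (result == null) {
        synchronized (lock) {
          result = snapshotId; // second check, under the lock
          if (result == null) {
            result = generateId();
            snapshotId = result;
          }
        }
      }
      return result;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Producer producer = new Producer();
    Thread[] threads = new Thread[4];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(producer::snapshotId);
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("generator ran " + producer.generations.get() + " time(s)");
  }
}
```

The id is still assigned lazily, which is presumably why this route avoids the test changes that constructor initialization would need.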

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584769254 Here is the list of tests to be adapted if we initialize in the constructor. ```

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #588: [WIP] Add sequence number for supporting row level delete

2020-02-11 Thread GitBox
aokolnychyi commented on a change in pull request #588: [WIP] Add sequence number for supporting row level delete URL: https://github.com/apache/incubator-iceberg/pull/588#discussion_r377842072 ## File path: core/src/main/java/org/apache/iceberg/BaseSnapshot.java ## @@

[GitHub] [incubator-iceberg] rdblue commented on issue #790: Improve getWriter in BaseRewriteManifests

2020-02-11 Thread GitBox
rdblue commented on issue #790: Improve getWriter in BaseRewriteManifests URL: https://github.com/apache/incubator-iceberg/pull/790#issuecomment-584810714 The value of `writers` doesn't change across threads. It is always the same concurrent map, so it doesn't matter that the variable

[GitHub] [incubator-iceberg] rdblue merged pull request #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
rdblue merged pull request #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789 This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [incubator-iceberg] rdblue commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
rdblue commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584817094 Sounds good. I merged this. The test failure was Python and Ted is looking into it.

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377854008 ## File path:

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer

2020-02-11 Thread GitBox
aokolnychyi commented on issue #789: Fix race condition in SnapshotProducer URL: https://github.com/apache/incubator-iceberg/pull/789#issuecomment-584815363 I propose to keep it as is to avoid additional changes and any hidden consequences.

[GitHub] [incubator-iceberg] rdblue commented on issue #782: Incremental scan followups

2020-02-11 Thread GitBox
rdblue commented on issue #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782#issuecomment-584817879 +1 I'll merge this with the test failure because those are Python and aren't related.

[GitHub] [incubator-iceberg] rdblue merged pull request #782: Incremental scan followups

2020-02-11 Thread GitBox
rdblue merged pull request #782: Incremental scan followups URL: https://github.com/apache/incubator-iceberg/pull/782

[GitHub] [incubator-iceberg] rdblue commented on issue #788: Integrate the Apache Flink into Apache Iceberg

2020-02-11 Thread GitBox
rdblue commented on issue #788: Integrate the Apache Flink into Apache Iceberg URL: https://github.com/apache/incubator-iceberg/issues/788#issuecomment-584909577 @openinx, I want to make this work. As I said on the doc, we want Iceberg to be a great at-rest data layer. Supporting

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
rdblue commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object. URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377962177 ## File path:

[GitHub] [incubator-iceberg] figo10203 opened a new issue #791: java.lang.RuntimeException: Metastore operation failed error reason

2020-02-11 Thread GitBox
figo10203 opened a new issue #791: java.lang.RuntimeException: Metastore operation failed error reason URL: https://github.com/apache/incubator-iceberg/issues/791 Hi guys, I'm new to Apache Iceberg. When I try to create a table with Hive like below: val catalog = new

[GitHub] [incubator-iceberg] sudssf opened a new issue #792: AvroSchemaUtil.toIceberg does not process logicalType during schema conversion

2020-02-11 Thread GitBox
sudssf opened a new issue #792: AvroSchemaUtil.toIceberg does not process logicalType during schema conversion URL: https://github.com/apache/incubator-iceberg/issues/792 The following sample tests show that schema conversion from Avro does not reflect the logical type from the Avro schema.