[GitHub] [incubator-iceberg] chenjunjiedada commented on issue #457: DataFrame generated by Seq() might have schema conflict with Iceberg

2019-10-10 Thread GitBox
chenjunjiedada commented on issue #457: DataFrame generated by Seq() might have schema conflict with Iceberg URL: https://github.com/apache/incubator-iceberg/issues/457#issuecomment-540431741 Found more discussion in #510

[GitHub] [incubator-iceberg] TGooch44 commented on issue #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations

2019-10-10 Thread GitBox
TGooch44 commented on issue #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations URL: https://github.com/apache/incubator-iceberg/pull/530#issuecomment-540549067 @rdblue @danielcweeks Can you take a look and see if this looks good? If we can get this merged,

[GitHub] [incubator-iceberg] chenjunjiedada opened a new pull request #529: add catalog hadoop table(WIP)

2019-10-10 Thread GitBox
chenjunjiedada opened a new pull request #529: add catalog hadoop table(WIP) URL: https://github.com/apache/incubator-iceberg/pull/529 This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [incubator-iceberg] TGooch44 opened a new pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations

2019-10-10 Thread GitBox
TGooch44 opened a new pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations URL: https://github.com/apache/incubator-iceberg/pull/530 This adds a package for connecting to the Hive metastore to manage Iceberg tables. It also fixes one minor lint issue.

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #512: Extend RewriteManifests with a way to add/delete manifests

2019-10-10 Thread GitBox
aokolnychyi commented on issue #512: Extend RewriteManifests with a way to add/delete manifests URL: https://github.com/apache/incubator-iceberg/pull/512#issuecomment-540674743 @rdblue I am not sure changing the status of entries from `ADDED` to `EXISTING` while copying manifests would

[GitHub] [incubator-iceberg] rdblue opened a new pull request #532: Avoid NullPointerException when a source column is missing

2019-10-10 Thread GitBox
rdblue opened a new pull request #532: Avoid NullPointerException when a source column is missing URL: https://github.com/apache/incubator-iceberg/pull/532 This adds a check for whether a source column was found for a given ID when validating a partition spec. Validation is already done

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations

2019-10-10 Thread GitBox
rdblue commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations URL: https://github.com/apache/incubator-iceberg/pull/530#discussion_r333649667 ## File path: python/iceberg/hive/hive_table_operations.py ## @@

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations

2019-10-10 Thread GitBox
rdblue commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations URL: https://github.com/apache/incubator-iceberg/pull/530#discussion_r333650024 ## File path: python/iceberg/hive/hive_table_operations.py ## @@

[GitHub] [incubator-iceberg] rdblue commented on issue #497: Support retaining last N snapshots

2019-10-10 Thread GitBox
rdblue commented on issue #497: Support retaining last N snapshots URL: https://github.com/apache/incubator-iceberg/pull/497#issuecomment-540674402 @aokolnychyi, yes. The behavior I'm suggesting would keep the 10 latest snapshots, no matter what the expiration time is. For these

[GitHub] [incubator-iceberg] rdblue opened a new pull request #533: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
rdblue opened a new pull request #533: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/533

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations

2019-10-10 Thread GitBox
rdblue commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations URL: https://github.com/apache/incubator-iceberg/pull/530#discussion_r333650333 ## File path: python/iceberg/hive/hive_table_operations.py ## @@

[GitHub] [incubator-iceberg] rdblue merged pull request #532: Avoid NullPointerException when a source column is missing

2019-10-10 Thread GitBox
rdblue merged pull request #532: Avoid NullPointerException when a source column is missing URL: https://github.com/apache/incubator-iceberg/pull/532

[GitHub] [incubator-iceberg] rdblue opened a new pull request #531: Update build for Apache releases

2019-10-10 Thread GitBox
rdblue opened a new pull request #531: Update build for Apache releases URL: https://github.com/apache/incubator-iceberg/pull/531 * Add Apache publication * Add source, javadoc, and test artifacts with signatures to publication * Add Apache snapshot and release repositories * Add

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #531: Update build for Apache releases

2019-10-10 Thread GitBox
rdblue commented on a change in pull request #531: Update build for Apache releases URL: https://github.com/apache/incubator-iceberg/pull/531#discussion_r333638989 ## File path: api/src/main/java/org/apache/iceberg/PartitionSpec.java ## @@ -482,6 +482,8 @@ public

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #497: Support retaining last N snapshots

2019-10-10 Thread GitBox
aokolnychyi commented on issue #497: Support retaining last N snapshots URL: https://github.com/apache/incubator-iceberg/pull/497#issuecomment-540643121 While onboarding some users, I realized that they sometimes struggle to find a timestamp they can use for expiration even though they

[GitHub] [incubator-iceberg] rdblue merged pull request #531: Update build for Apache releases

2019-10-10 Thread GitBox
rdblue merged pull request #531: Update build for Apache releases URL: https://github.com/apache/incubator-iceberg/pull/531

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #497: Support retaining last N snapshots

2019-10-10 Thread GitBox
aokolnychyi commented on issue #497: Support retaining last N snapshots URL: https://github.com/apache/incubator-iceberg/pull/497#issuecomment-540689319 Not confident yet, but hour-based retention might make sense for some use cases. I'd vote for having two table properties: min retention

[GitHub] [incubator-iceberg] xabriel commented on a change in pull request #531: Update build for Apache releases

2019-10-10 Thread GitBox
xabriel commented on a change in pull request #531: Update build for Apache releases URL: https://github.com/apache/incubator-iceberg/pull/531#discussion_r333632793 ## File path: api/src/main/java/org/apache/iceberg/PartitionSpec.java ## @@ -482,6 +482,8 @@ public

[GitHub] [incubator-iceberg] rdblue opened a new pull request #534: Remove version.txt accidentally in release update PR

2019-10-10 Thread GitBox
rdblue opened a new pull request #534: Remove version.txt accidentally in release update PR URL: https://github.com/apache/incubator-iceberg/pull/534

[GitHub] [incubator-iceberg] manishmalhotrawork commented on a change in pull request #524: respect commit.manifest.min.count

2019-10-10 Thread GitBox
manishmalhotrawork commented on a change in pull request #524: respect commit.manifest.min.count URL: https://github.com/apache/incubator-iceberg/pull/524#discussion_r333674855 ## File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java ## @@ -595,6

[GitHub] [incubator-iceberg] rdblue opened a new pull request #536: Add How to Release docs

2019-10-10 Thread GitBox
rdblue opened a new pull request #536: Add How to Release docs URL: https://github.com/apache/incubator-iceberg/pull/536

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #227: ORC column map fix

2019-10-10 Thread GitBox
rdblue commented on a change in pull request #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r333771237 ## File path: orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40

2019-10-10 Thread GitBox
rdblue commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40 URL: https://github.com/apache/incubator-iceberg/pull/207#discussion_r333774509 ## File path:

[GitHub] [incubator-iceberg] manishmalhotrawork commented on issue #499: Add persistent IDs to partition fields (WIP)

2019-10-10 Thread GitBox
manishmalhotrawork commented on issue #499: Add persistent IDs to partition fields (WIP) URL: https://github.com/apache/incubator-iceberg/pull/499#issuecomment-540839338 >> this Id will be used as the init Id if new PartitionSpec is added > This is assigning an ID. Those IDs should

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #227: ORC column map fix

2019-10-10 Thread GitBox
edgarRd commented on a change in pull request #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r333787809 ## File path: orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the

[GitHub] [incubator-iceberg] manishmalhotrawork edited a comment on issue #499: Add persistent IDs to partition fields (WIP)

2019-10-10 Thread GitBox
manishmalhotrawork edited a comment on issue #499: Add persistent IDs to partition fields (WIP) URL: https://github.com/apache/incubator-iceberg/pull/499#issuecomment-540821308 @rdblue thanks. >It looks like this is trying to assign the same IDs for a spec each time it is created,

[GitHub] [incubator-iceberg] manishmalhotrawork commented on issue #499: Add persistent IDs to partition fields (WIP)

2019-10-10 Thread GitBox
manishmalhotrawork commented on issue #499: Add persistent IDs to partition fields (WIP) URL: https://github.com/apache/incubator-iceberg/pull/499#issuecomment-540821308 @rdblue thanks. >It looks like this is trying to assign the same IDs for a spec each time it is created, but I

[GitHub] [incubator-iceberg] rdblue commented on issue #499: Add persistent IDs to partition fields (WIP)

2019-10-10 Thread GitBox
rdblue commented on issue #499: Add persistent IDs to partition fields (WIP) URL: https://github.com/apache/incubator-iceberg/pull/499#issuecomment-540824007 > this Id will be used as the init Id if new PartitionSpec is added

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #227: ORC column map fix

2019-10-10 Thread GitBox
edgarRd commented on a change in pull request #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r333786425 ## File path: orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the

[GitHub] [incubator-iceberg] yathindranath commented on a change in pull request #497: Support retaining last N snapshots

2019-10-10 Thread GitBox
yathindranath commented on a change in pull request #497: Support retaining last N snapshots URL: https://github.com/apache/incubator-iceberg/pull/497#discussion_r333790897 ## File path: core/src/main/java/org/apache/iceberg/RemoveSnapshots.java ## @@ -77,8 +79,34 @@

[GitHub] [incubator-iceberg] rdblue commented on issue #491: Use relative path for manifest_path and file_path

2019-10-10 Thread GitBox
rdblue commented on issue #491: Use relative path for manifest_path and file_path URL: https://github.com/apache/incubator-iceberg/pull/491#issuecomment-540815248 I'm not sure about paths relative to manifest locations. That sounds complex to reason about and manage to me. What is

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40

2019-10-10 Thread GitBox
rdsr commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40 URL: https://github.com/apache/incubator-iceberg/pull/207#discussion_r333785084 ## File path:

[GitHub] [incubator-iceberg] rdblue commented on issue #499: Add persistent IDs to partition fields (WIP)

2019-10-10 Thread GitBox
rdblue commented on issue #499: Add persistent IDs to partition fields (WIP) URL: https://github.com/apache/incubator-iceberg/pull/499#issuecomment-540848516 Yeah, the first step is to add a partition field ID in addition to the existing source field ID.

[GitHub] [incubator-iceberg] rdblue commented on issue #467: Change method signature of ManifestReader::read (#459)

2019-10-10 Thread GitBox
rdblue commented on issue #467: Change method signature of ManifestReader::read (#459) URL: https://github.com/apache/incubator-iceberg/pull/467#issuecomment-540848300 This looks reasonable to me. @aokolnychyi, any comments?

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #227: ORC column map fix

2019-10-10 Thread GitBox
edgarRd commented on a change in pull request #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r333786764 ## File path: orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the

[GitHub] [incubator-iceberg] rdblue merged pull request #534: Remove version.txt accidentally in release update PR

2019-10-10 Thread GitBox
rdblue merged pull request #534: Remove version.txt accidentally in release update PR URL: https://github.com/apache/incubator-iceberg/pull/534

[GitHub] [incubator-iceberg] TGooch44 commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations

2019-10-10 Thread GitBox
TGooch44 commented on a change in pull request #530: [python] adding Hive package to wrap BaseMetastoreTables/TableOperations URL: https://github.com/apache/incubator-iceberg/pull/530#discussion_r333669663 ## File path: python/iceberg/hive/hive_table_operations.py ## @@

[GitHub] [incubator-iceberg] mccheah commented on issue #535: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
mccheah commented on issue #535: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/535#issuecomment-540716243 Attempt at fixing https://github.com/apache/incubator-iceberg/pull/533 versioning stuff

[GitHub] [incubator-iceberg] mccheah opened a new pull request #535: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
mccheah opened a new pull request #535: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/535

[GitHub] [incubator-iceberg] rdblue commented on issue #525: Apply Baseline to iceberg-pig

2019-10-10 Thread GitBox
rdblue commented on issue #525: Apply Baseline to iceberg-pig URL: https://github.com/apache/incubator-iceberg/pull/525#issuecomment-540710884 Thanks for working on these, @Fokko! Great to see you in this community as well as Avro and Parquet!

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #526: Add Baseline to iceberg-parquet

2019-10-10 Thread GitBox
rdsr commented on a change in pull request #526: Add Baseline to iceberg-parquet URL: https://github.com/apache/incubator-iceberg/pull/526#discussion_r03586 ## File path: build.gradle ## @@ -165,7 +165,7 @@ task deploySite(type: Exec) { // Baseline style guide. def

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #227: ORC column map fix

2019-10-10 Thread GitBox
edgarRd commented on a change in pull request #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r333745493 ## File path: orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the

[GitHub] [incubator-iceberg] rdblue merged pull request #535: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
rdblue merged pull request #535: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/535

[GitHub] [incubator-iceberg] rdblue commented on issue #535: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
rdblue commented on issue #535: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/535#issuecomment-540738554 Thanks for fixing this, @mccheah!

[GitHub] [incubator-iceberg] rdblue commented on issue #533: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
rdblue commented on issue #533: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/533#issuecomment-540738857 Fixed by #535

[GitHub] [incubator-iceberg] rdblue closed pull request #533: Update Jackson to 2.9.10 for CVE-2019-14379

2019-10-10 Thread GitBox
rdblue closed pull request #533: Update Jackson to 2.9.10 for CVE-2019-14379 URL: https://github.com/apache/incubator-iceberg/pull/533

[GitHub] [incubator-iceberg] edgarRd commented on a change in pull request #227: ORC column map fix

2019-10-10 Thread GitBox
edgarRd commented on a change in pull request #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r333739073 ## File path: orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the

[GitHub] [incubator-iceberg] yathindranath commented on issue #497: Support retaining last N snapshots

2019-10-10 Thread GitBox
yathindranath commented on issue #497: Support retaining last N snapshots URL: https://github.com/apache/incubator-iceberg/pull/497#issuecomment-540860262 @rdblue @aokolnychyi what about the below approach? Use just one table property, may be

[GitHub] [incubator-iceberg] bztsai opened a new pull request #537: Docs: Fix typos

2019-10-10 Thread GitBox
bztsai opened a new pull request #537: Docs: Fix typos URL: https://github.com/apache/incubator-iceberg/pull/537

[GitHub] [incubator-iceberg] rdsr commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40

2019-10-10 Thread GitBox
rdsr commented on a change in pull request #207: Add external schema mappings for files written with name-based schemas #40 URL: https://github.com/apache/incubator-iceberg/pull/207#discussion_r333804206 ## File path: