[GitHub] [incubator-iceberg] aokolnychyi commented on issue #387: Optimize overwrite and delete commits.

2019-08-15 Thread GitBox
aokolnychyi commented on issue #387: Optimize overwrite and delete commits. URL: https://github.com/apache/incubator-iceberg/pull/387#issuecomment-521615443 I think it is a substantial feature; can we add a couple of tests?

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #387: Optimize overwrite and delete commits.

2019-08-15 Thread GitBox
aokolnychyi commented on issue #387: Optimize overwrite and delete commits. URL: https://github.com/apache/incubator-iceberg/pull/387#issuecomment-521615928 Do we want to overload `deleteFile(DataFile)` in `StreamingDelete` to use this functionality? Right now, it will call

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #362: Support create and replace transactions in Catalog

2019-08-15 Thread GitBox
aokolnychyi commented on a change in pull request #362: Support create and replace transactions in Catalog URL: https://github.com/apache/incubator-iceberg/pull/362#discussion_r314315329 ## File path: hive/src/test/java/org/apache/iceberg/hive/HiveCreateReplaceTableTest.java

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #362: Support create and replace transactions in Catalog

2019-08-15 Thread GitBox
aokolnychyi commented on a change in pull request #362: Support create and replace transactions in Catalog URL: https://github.com/apache/incubator-iceberg/pull/362#discussion_r314314329 ## File path: api/src/main/java/org/apache/iceberg/catalog/Catalog.java ## @@ -97,6

[GitHub] [incubator-iceberg] jerryshao commented on issue #370: Iceberg failed to work with embedded metastore, which is by default in Spark

2019-08-15 Thread GitBox
jerryshao commented on issue #370: Iceberg failed to work with embedded metastore, which is by default in Spark URL: https://github.com/apache/incubator-iceberg/issues/370#issuecomment-521614634 Working on it, will submit a PR.

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #387: Optimize overwrite and delete commits.

2019-08-15 Thread GitBox
aokolnychyi commented on issue #387: Optimize overwrite and delete commits. URL: https://github.com/apache/incubator-iceberg/pull/387#issuecomment-521618913 Once this is merged, I'll update #351. This is an automated message

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#discussion_r314234844 ## File path: api/src/main/java/org/apache/iceberg/OverwriteFiles.java

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#discussion_r314233769 ## File path: core/src/main/java/org/apache/iceberg/OverwriteData.java

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#discussion_r314236321 ## File path:

[GitHub] [incubator-iceberg] aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
aokolnychyi commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#discussion_r314236135 ## File path: api/src/main/java/org/apache/iceberg/OverwriteFiles.java

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #389: Add test cases

2019-08-15 Thread GitBox
aokolnychyi commented on issue #389: Add test cases URL: https://github.com/apache/incubator-iceberg/pull/389#issuecomment-521594956 I actually didn't know we have `ScanSummary`. Do you use it for debugging purposes? This is

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
aokolnychyi commented on issue #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#issuecomment-521580911 @rdblue What is our story on case sensitivity for metrics evaluators? We respect it in

[GitHub] [incubator-iceberg] rdblue merged pull request #384: Add BaseCombinedScanTask.toString

2019-08-15 Thread GitBox
rdblue merged pull request #384: Add BaseCombinedScanTask.toString URL: https://github.com/apache/incubator-iceberg/pull/384 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [incubator-iceberg] rdblue merged pull request #385: Make original ManifestReader factory method public

2019-08-15 Thread GitBox
rdblue merged pull request #385: Make original ManifestReader factory method public URL: https://github.com/apache/incubator-iceberg/pull/385

[GitHub] [incubator-iceberg] linxingyuan1102 commented on issue #227: ORC column map fix

2019-08-15 Thread GitBox
linxingyuan1102 commented on issue #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#issuecomment-521736621 FYI, I prototyped ID-based column mapping for Presto in prestosql/presto#1290, using the type annotations introduced to ORC. So the type annotation

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #374: Migrate spark table to iceberg table

2019-08-15 Thread GitBox
rdblue commented on a change in pull request #374: Migrate spark table to iceberg table URL: https://github.com/apache/incubator-iceberg/pull/374#discussion_r314399396 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -297,5 +301,88 @@

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #374: Migrate spark table to iceberg table

2019-08-15 Thread GitBox
rdblue commented on a change in pull request #374: Migrate spark table to iceberg table URL: https://github.com/apache/incubator-iceberg/pull/374#discussion_r314399797 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -297,5 +301,88 @@

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #374: Migrate spark table to iceberg table

2019-08-15 Thread GitBox
rdblue commented on a change in pull request #374: Migrate spark table to iceberg table URL: https://github.com/apache/incubator-iceberg/pull/374#discussion_r314399700 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -297,5 +301,88 @@

[GitHub] [incubator-iceberg] rdblue merged pull request #389: Add test cases

2019-08-15 Thread GitBox
rdblue merged pull request #389: Add test cases URL: https://github.com/apache/incubator-iceberg/pull/389

[GitHub] [incubator-iceberg] rdblue commented on issue #387: Optimize overwrite and delete commits.

2019-08-15 Thread GitBox
rdblue commented on issue #387: Optimize overwrite and delete commits. URL: https://github.com/apache/incubator-iceberg/pull/387#issuecomment-521700796 @aokolnychyi, the existing tests cover correctness. This just speeds up the operations by ignoring manifests that can't match.
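The idea in this comment, skipping manifests that cannot contain matching files, can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from PR #387: `ManifestSummary`, `mightContain`, and `manifestsToRewrite` are hypothetical names, and the sketch assumes each manifest records only the lower and upper bound of a single integer partition value.

```java
import java.util.ArrayList;
import java.util.List;

public class ManifestPruningSketch {
  /** Hypothetical stand-in for a manifest: only the min/max of one integer partition value. */
  static final class ManifestSummary {
    final String path;
    final int lowerBound;
    final int upperBound;

    ManifestSummary(String path, int lowerBound, int upperBound) {
      this.path = path;
      this.lowerBound = lowerBound;
      this.upperBound = upperBound;
    }

    /** True if some file in this manifest could have a partition value in [from, to]. */
    boolean mightContain(int from, int to) {
      return lowerBound <= to && upperBound >= from;
    }
  }

  /** Returns the paths of manifests that must be opened for a delete over [from, to]. */
  static List<String> manifestsToRewrite(List<ManifestSummary> manifests, int from, int to) {
    List<String> result = new ArrayList<>();
    for (ManifestSummary manifest : manifests) {
      // Manifests whose bounds cannot overlap the delete range are skipped unread.
      if (manifest.mightContain(from, to)) {
        result.add(manifest.path);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<ManifestSummary> manifests = new ArrayList<>();
    manifests.add(new ManifestSummary("m1.avro", 1, 10));
    manifests.add(new ManifestSummary("m2.avro", 11, 20));
    manifests.add(new ManifestSummary("m3.avro", 21, 30));
    // Only m2.avro overlaps the range [12, 15].
    System.out.println(manifestsToRewrite(manifests, 12, 15)); // prints [m2.avro]
  }
}
```

Because the bounds check only rules manifests out, the result of the delete is unchanged; only the number of manifests read shrinks, which is consistent with the remark that the existing tests already cover correctness.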

[GitHub] [incubator-iceberg] rdblue commented on issue #227: ORC column map fix

2019-08-15 Thread GitBox
rdblue commented on issue #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#issuecomment-521738363 Thanks for the update, @linxingyuan1102. It would be great to have your help getting the spec updated. In the end, I think Presto should depend on Iceberg to

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
rdblue commented on a change in pull request #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#discussion_r314385635 ## File path: core/src/main/java/org/apache/iceberg/OverwriteData.java

[GitHub] [incubator-iceberg] rdblue commented on issue #389: Add test cases

2019-08-15 Thread GitBox
rdblue commented on issue #389: Add test cases URL: https://github.com/apache/incubator-iceberg/pull/389#issuecomment-521700270 We use `ScanSummary` to provide partition-level summaries from our metastore. It's something that we could have put elsewhere, but I thought it was useful enough

[GitHub] [incubator-iceberg] rdblue commented on issue #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
rdblue commented on issue #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#issuecomment-521699751 I guess we should add a method to configure case sensitivity. We could also add a boolean flag to the

[GitHub] [incubator-iceberg] rdblue merged pull request #377: Add FindFiles helper API

2019-08-15 Thread GitBox
rdblue merged pull request #377: Add FindFiles helper API URL: https://github.com/apache/incubator-iceberg/pull/377

[GitHub] [incubator-iceberg] rdblue commented on issue #377: Add FindFiles helper API

2019-08-15 Thread GitBox
rdblue commented on issue #377: Add FindFiles helper API URL: https://github.com/apache/incubator-iceberg/pull/377#issuecomment-521710811 Merging this. Thanks for reviewing, @xabriel!

[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #374: Migrate spark table to iceberg table

2019-08-15 Thread GitBox
rdblue commented on a change in pull request #374: Migrate spark table to iceberg table URL: https://github.com/apache/incubator-iceberg/pull/374#discussion_r314397922 ## File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala ## @@ -297,5 +301,88 @@

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314505113 ## File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java ## @@ -0,0 +1,212 @@ +/* + *

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #393: include column_sizes in stats columns

2019-08-15 Thread GitBox
aokolnychyi commented on issue #393: include column_sizes in stats columns URL: https://github.com/apache/incubator-iceberg/pull/393#issuecomment-521808846 LGTM. Thanks, @manishmalhotrawork!

[GitHub] [incubator-iceberg] manishmalhotrawork opened a new pull request #393: include column_sizes in stats columns

2019-08-15 Thread GitBox
manishmalhotrawork opened a new pull request #393: include column_sizes in stats columns URL: https://github.com/apache/incubator-iceberg/pull/393 This is for issue #269. My understanding is that `column_sizes` was calculated by `ParquetUtil.footerMetrics` or `ParquetUtil.fileMetrics`

[GitHub] [incubator-iceberg] aokolnychyi commented on issue #351: Extend Iceberg with a way to overwrite files for eager updates/deletes

2019-08-15 Thread GitBox
aokolnychyi commented on issue #351: Extend Iceberg with a way to overwrite files for eager updates/deletes URL: https://github.com/apache/incubator-iceberg/pull/351#issuecomment-521755450 My first guess was to use a separate method as it is probably more descriptive than passing a

[GitHub] [incubator-iceberg] edgarRd commented on issue #227: ORC column map fix

2019-08-15 Thread GitBox
edgarRd commented on issue #227: ORC column map fix URL: https://github.com/apache/incubator-iceberg/pull/227#issuecomment-521789477 @linxingyuan1102 I agree, we need to specify the annotations for ORC column mapping in the spec. I have a proposal in

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314584436 ## File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java ## @@ -0,0 +1,212 @@ +/* + *

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314583939 ## File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java ## @@ -0,0 +1,212 @@ +/* + *

[GitHub] [incubator-iceberg] waterlx opened a new issue #394: Wrong URL of "Iceberg API" on Spark user docs

2019-08-15 Thread GitBox
waterlx opened a new issue #394: Wrong URL of "Iceberg API" on Spark user docs URL: https://github.com/apache/incubator-iceberg/issues/394 On the page of https://iceberg.incubator.apache.org/spark/, in "Spark 2.4 is limited to reading and writing existing Iceberg tables. Use the Iceberg

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314584147 ## File path: api/src/main/java/org/apache/iceberg/expressions/ExpressionVisitors.java ## @@ -89,12 +93,12 @@

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314583886 ## File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java ## @@ -0,0 +1,212 @@ +/* + *

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314583647 ## File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java ## @@ -0,0 +1,212 @@ +/* + *

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314585257 ## File path: api/src/main/java/org/apache/iceberg/expressions/LiteralSet.java ## @@ -0,0 +1,212 @@ +/* + *

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314585643 ## File path: api/src/main/java/org/apache/iceberg/expressions/ManifestEvaluator.java ## @@ -245,12 +245,12 @@

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314586632 ## File path: api/src/main/java/org/apache/iceberg/expressions/Expressions.java ## @@ -109,16 +111,46 @@ public

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314589849 ## File path: api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java ## @@ -125,13 +154,38 @@

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314589687 ## File path: api/src/main/java/org/apache/iceberg/expressions/Predicate.java ## @@ -19,15 +19,49 @@ package

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314585849 ## File path: parquet/src/main/java/org/apache/iceberg/parquet/ParquetDictionaryRowGroupFilter.java ## @@

[GitHub] [incubator-iceberg] jun-he commented on a change in pull request #357: Add in and not in predicates

2019-08-15 Thread GitBox
jun-he commented on a change in pull request #357: Add in and not in predicates URL: https://github.com/apache/incubator-iceberg/pull/357#discussion_r314588864 ## File path: api/src/main/java/org/apache/iceberg/expressions/UnboundPredicate.java ## @@ -125,13 +154,38 @@