[GitHub] [iceberg] findepi commented on pull request #6474: Make it explicit that metrics reporter is required

2023-01-20 Thread GitBox
findepi commented on PR #6474: URL: https://github.com/apache/iceberg/pull/6474#issuecomment-1398101458 thanks for the merge! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [iceberg] JanKaul commented on issue #6420: Iceberg Materialized View Spec

2023-01-20 Thread GitBox
JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1398113386 Yes, I agree with the proposed design 1. I'm not entirely sure what @rdblue prefers. I will update the Google doc accordingly. The next question for me is where and how

[GitHub] [iceberg] kingeasternsun commented on pull request #6624: 🎨 Add "parallelism" parameter to "add_files" syscall and MigrateTable, SnapshotTable.

2023-01-20 Thread GitBox
kingeasternsun commented on PR #6624: URL: https://github.com/apache/iceberg/pull/6624#issuecomment-1398154903 > Left a review, thanks for the contribution @kingeasternsun ! Also looks like spotless checks are failing which you can fix by running `./gradlew :iceberg-api:spotlessJavaCheck`

[GitHub] [iceberg] cgpoh opened a new issue, #6630: Purpose of MAX_CONTINUOUS_EMPTY_COMMITS in IcebergFilesCommitter

2023-01-20 Thread GitBox
cgpoh opened a new issue, #6630: URL: https://github.com/apache/iceberg/issues/6630 ### Query engine Flink ### Question I have a Flink job that uses side output to write to Iceberg table when there are errors in the main processing function. If there are no errors in the

[GitHub] [iceberg] ajantha-bhat commented on pull request #6628: Nessie: Bump to 0.47.0

2023-01-20 Thread GitBox
ajantha-bhat commented on PR #6628: URL: https://github.com/apache/iceberg/pull/6628#issuecomment-1398211777 I think we can bump it to `0.47.1` now.

[GitHub] [iceberg] mriveraFacephi commented on issue #2040: Partial data ingestion to Iceberg in failing with Spark 3.0.x

2023-01-20 Thread GitBox
mriveraFacephi commented on issue #2040: URL: https://github.com/apache/iceberg/issues/2040#issuecomment-1398240262 Same problem here with Spark 3.1.1 and Iceberg 0.13.1. I'm trying to write dataframe by using the Spark v2 API command writeTo. Every column in my schema is nullable. In my

[GitHub] [iceberg] gaborkaszab commented on issue #6257: Partitions metadata table shows old partitions

2023-01-20 Thread GitBox
gaborkaszab commented on issue #6257: URL: https://github.com/apache/iceberg/issues/6257#issuecomment-1398396353 > What would the algorithm be? If the partition has delete files, try to do a full MOR, and check if records are null? Personally, sounds a bit extreme, I would think a good firs

[GitHub] [iceberg] Fokko merged pull request #6628: Nessie: Bump to 0.47.0

2023-01-20 Thread GitBox
Fokko merged PR #6628: URL: https://github.com/apache/iceberg/pull/6628

[GitHub] [iceberg] jackye1995 commented on issue #6420: Iceberg Materialized View Spec

2023-01-20 Thread GitBox
jackye1995 commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1398572156 I would +1 on storing in snapshot summary, because: 1. snapshot corresponds very well to MV refresh, there is a 1:1 relationship between them. 2. table properties is not vers

[GitHub] [iceberg] jackye1995 commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
jackye1995 commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398577729 @nastra any thoughts?

[GitHub] [iceberg] nastra commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
nastra commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398623955 I also like annotations like `@Nullable` to indicate that certain things in the API can be nullable as this makes it easier to consume that particular API and reason about it. May

[GitHub] [iceberg] jackye1995 commented on issue #6625: Improve nullability check in Iceberg codebase

2023-01-20 Thread GitBox
jackye1995 commented on issue #6625: URL: https://github.com/apache/iceberg/issues/6625#issuecomment-1398630489 > should we maybe raise this discussion topic on the mailing list in order to increase visibility for people? Yes agree, let's do that so we can reach a consensus and procee

[GitHub] [iceberg] stevenzwu merged pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-20 Thread GitBox
stevenzwu merged PR #6584: URL: https://github.com/apache/iceberg/pull/6584

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6584: Flink: support reading as Avro GenericRecord for FLIP-27 IcebergSource

2023-01-20 Thread GitBox
stevenzwu commented on code in PR #6584: URL: https://github.com/apache/iceberg/pull/6584#discussion_r1082782121 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/source/reader/AvroGenericRecordReaderFunction.java: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Sof

[GitHub] [iceberg] stevenzwu opened a new pull request, #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu opened a new pull request, #6631: URL: https://github.com/apache/iceberg/pull/6631 I also piggybacked the fix of package name (a mishap from PR #6584). some classes should be in the `flink/source/reader` packages.

[GitHub] [iceberg] stevenzwu commented on pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu commented on PR #6631: URL: https://github.com/apache/iceberg/pull/6631#issuecomment-1398722055 I checked the following diff and found nothing related to the classes touched by PR #6584 ``` git diff --no-index flink/v1.14/flink/src/ flink/v1.16/flink/src git diff -

[GitHub] [iceberg] amogh-jahagirdar opened a new issue, #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
amogh-jahagirdar opened a new issue, #6632: URL: https://github.com/apache/iceberg/issues/6632 ### Apache Iceberg version 1.1.0 (latest release) ### Query engine None ### Please describe the bug 🐞 Creating this issue for awareness, was discussing with @rdblu

[GitHub] [iceberg] amogh-jahagirdar commented on issue #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
amogh-jahagirdar commented on issue #6632: URL: https://github.com/apache/iceberg/issues/6632#issuecomment-1398725654 I'm working on a fix for this @jackye1995 could you assign this to me?

[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6631: Flink: backport PR #6584 to 1.14 and 1.15 for Avro GenericRecord in FLIP-27 source

2023-01-20 Thread GitBox
stevenzwu commented on code in PR #6631: URL: https://github.com/apache/iceberg/pull/6631#discussion_r1082878466 ## flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestRowDataToAvroGenericRecordConverter.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache So

[GitHub] [iceberg] jackye1995 commented on issue #6632: Bug with Branch Transactions

2023-01-20 Thread GitBox
jackye1995 commented on issue #6632: URL: https://github.com/apache/iceberg/issues/6632#issuecomment-1398734413 Sure, assigned!

[GitHub] [iceberg] nastra commented on a diff in pull request #6065: Fix TestAggregateBinding

2022-10-26 Thread GitBox
nastra commented on code in PR #6065: URL: https://github.com/apache/iceberg/pull/6065#discussion_r1006440948 ## api/src/test/java/org/apache/iceberg/expressions/TestAggregateBinding.java: ## @@ -18,44 +18,28 @@ */ package org.apache.iceberg.expressions; -import java.util.A

[GitHub] [iceberg] nastra commented on pull request #6062: Format: Geometry-support first pass

2022-10-27 Thread GitBox
nastra commented on PR #6062: URL: https://github.com/apache/iceberg/pull/6062#issuecomment-1293089668 @thomafred thanks for your contribution. In order to raise visibility and to gather general feedback I think it would be great if you could bring up this topic on the [Dev Mailing list](d.

[GitHub] [iceberg] thomafred commented on pull request #6062: Format: Geometry-support first pass

2022-10-27 Thread GitBox
thomafred commented on PR #6062: URL: https://github.com/apache/iceberg/pull/6062#issuecomment-1293091104 @nastra Thank you for your feedback. I will bring this up in the DEV mailing list as you suggest :)

[GitHub] [iceberg] thomafred commented on pull request #6062: Format: Geometry-support first pass

2022-10-27 Thread GitBox
thomafred commented on PR #6062: URL: https://github.com/apache/iceberg/pull/6062#issuecomment-1293113368 Link to mailing list thread: https://lists.apache.org/thread/t6nk5t5j1p302hmcjs77lndm07ssk8cl

[GitHub] [iceberg] eyeryone opened a new issue, #6067: exec insert into (hive on spark),no erro log,but table no data

2022-10-27 Thread GitBox
eyeryone opened a new issue, #6067: URL: https://github.com/apache/iceberg/issues/6067 ### Apache Iceberg version 0.14.0 ### Query engine Spark ### Please describe the bug 🐞 exec insert into (hive on spark),no erro log,but table no data. if use `set hive.e

[GitHub] [iceberg] eyeryone commented on issue #6067: exec insert into (hive on spark),no erro log,but table no data

2022-10-27 Thread GitBox
eyeryone commented on issue #6067: URL: https://github.com/apache/iceberg/issues/6067#issuecomment-1293262566 env:cdh6.3.2 hive 2.1.1 spark 2.4

[GitHub] [iceberg] eyeryone commented on issue #6067: exec insert into (hive on spark),no erro log,but table no data

2022-10-27 Thread GitBox
eyeryone commented on issue #6067: URL: https://github.com/apache/iceberg/issues/6067#issuecomment-1293265262 It is normal to insert other types of tables(hive on spark)

[GitHub] [iceberg] Jonathan-Rosenberg opened a new pull request, #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-27 Thread GitBox
Jonathan-Rosenberg opened a new pull request, #6068: URL: https://github.com/apache/iceberg/pull/6068 Currently, the link to the custom catalog class implementation is referring to the heading of the page. This PR fixes that behavior.

[GitHub] [iceberg] rdblue commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-10-27 Thread GitBox
rdblue commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1007067162 ## core/src/main/java/org/apache/iceberg/BaseTableScan.java: ## @@ -135,11 +143,14 @@ public CloseableIterable planFiles() { planningDuration.stop();

[GitHub] [iceberg] nastra commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-10-27 Thread GitBox
nastra commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1007091840 ## core/src/main/java/org/apache/iceberg/BaseTableScan.java: ## @@ -135,11 +143,14 @@ public CloseableIterable planFiles() { planningDuration.stop();

[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #2276: Core: Add option to combine tasks by partition

2022-10-27 Thread GitBox
aokolnychyi commented on code in PR #2276: URL: https://github.com/apache/iceberg/pull/2276#discussion_r1007098070 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -129,6 +129,34 @@ default ThisT select(String... columns) { */ ThisT planWith(ExecutorService execut

[GitHub] [iceberg] aokolnychyi commented on pull request #2276: Core: Add option to combine tasks by partition

2022-10-27 Thread GitBox
aokolnychyi commented on PR #2276: URL: https://github.com/apache/iceberg/pull/2276#issuecomment-1293765177 @sunchao, I think we were right during the earlier discussion. If there are multiple specs, the partition expressions we report in `KeyGroupPartitioning` should be an intersection of

[GitHub] [iceberg] sachinag commented on pull request #1972: First version of the Iceberg sink for Apache Beam

2022-10-27 Thread GitBox
sachinag commented on PR #1972: URL: https://github.com/apache/iceberg/pull/1972#issuecomment-1293768654 To the extent we on the Beam team can help, please let me (us) know. We'd love to see this completed and merged.

[GitHub] [iceberg] stevenzwu commented on issue #6066: flink restore failed with filenotfound

2022-10-27 Thread GitBox
stevenzwu commented on issue #6066: URL: https://github.com/apache/iceberg/issues/6066#issuecomment-1293769967 `.avro` might be a manifest file. do you have the complete stack trace? Which Flink version? I couldn't find this log line in 1.13 (or 1.14 and 1.15). ``` 2022-10-21

[GitHub] [iceberg] flyrain merged pull request #6046: Spark 3.1: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly

2022-10-27 Thread GitBox
flyrain merged PR #6046: URL: https://github.com/apache/iceberg/pull/6046

[GitHub] [iceberg] nastra commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-27 Thread GitBox
nastra commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007172686 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possible to

[GitHub] [iceberg] ahshahid commented on issue #6039: Spark : Perf enhancement by leveraging Dynamic Partition Pruning rule of spark for non partition columns used as join condition

2022-10-27 Thread GitBox
ahshahid commented on issue #6039: URL: https://github.com/apache/iceberg/issues/6039#issuecomment-1293878662 I have a PR locally ready, doing some perf testing & general testing... will open an upstream PR soon.

[GitHub] [iceberg] dhruv-pratap opened a new pull request, #6069: Python: TableScan Plan files API implementation without row-level filters evaluation.

2022-10-27 Thread GitBox
dhruv-pratap opened a new pull request, #6069: URL: https://github.com/apache/iceberg/pull/6069 Taking a first dig at TableScan Plan Files API. The implementation at present evaluates partition filters but does not evaluate row-level filters at the moment as it requires Projections.

[GitHub] [iceberg] joshuarobinson opened a new pull request, #6070: PyArrow should convert timestamps to microseconds.

2022-10-27 Thread GitBox
joshuarobinson opened a new pull request, #6070: URL: https://github.com/apache/iceberg/pull/6070 The Iceberg spec keeps timestamp in microsecond format so we should convert to a PyArrow type that doesn't lose precision. The [Iceberg spec](https://iceberg.apache.org/docs/latest/schema

[GitHub] [iceberg] aokolnychyi commented on pull request #2276: Core: Add option to combine tasks by partition

2022-10-27 Thread GitBox
aokolnychyi commented on PR #2276: URL: https://github.com/apache/iceberg/pull/2276#issuecomment-1294031962 Let's think through the algorithm given what we discussed. The main problem is that we don't know what spec IDs are affected by a scan until we plan files. I think the following would

[GitHub] [iceberg] barronw closed issue #6018: What is the expected behavior of expireOlderThan for a table with a tag that has not reached max age?

2022-10-27 Thread GitBox
barronw closed issue #6018: What is the expected behavior of expireOlderThan for a table with a tag that has not reached max age? URL: https://github.com/apache/iceberg/issues/6018

[GitHub] [iceberg] rdblue merged pull request #6047: Core: Replace projected Schema with schemaId/fieldIds/fieldNames in ScanReport

2022-10-27 Thread GitBox
rdblue merged PR #6047: URL: https://github.com/apache/iceberg/pull/6047

[GitHub] [iceberg] rdblue commented on pull request #6047: Core: Replace projected Schema with schemaId/fieldIds/fieldNames in ScanReport

2022-10-27 Thread GitBox
rdblue commented on PR #6047: URL: https://github.com/apache/iceberg/pull/6047#issuecomment-1294141919 Looks good. Thanks, @nastra!

[GitHub] [iceberg] C-h-e-r-r-y commented on issue #5867: Facing error when creating iceberg table in EMR using Glue catalog

2022-10-27 Thread GitBox
C-h-e-r-r-y commented on issue #5867: URL: https://github.com/apache/iceberg/issues/5867#issuecomment-1294143670 > can try setting glue.skip-name-validation via catalog properties if you wanna skip these validations : It is very hard to figure out how to set these properties. Could you

[GitHub] [iceberg] rdblue commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-10-27 Thread GitBox
rdblue commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1007419182 ## core/src/main/java/org/apache/iceberg/BaseTableScan.java: ## @@ -135,11 +143,14 @@ public CloseableIterable planFiles() { planningDuration.stop();

[GitHub] [iceberg] rdblue commented on pull request #6058: Core,Spark: Add metadata to Scan Report

2022-10-27 Thread GitBox
rdblue commented on PR #6058: URL: https://github.com/apache/iceberg/pull/6058#issuecomment-1294146937 Overall this looks good, just needs to be rebased now that #6047 is in. The name `metadata` seems okay. I can't come up with a better one.

[GitHub] [iceberg] flyrain commented on a diff in pull request #6063: Spark 3.2: #6041 follow-up/cleanup

2022-10-27 Thread GitBox
flyrain commented on code in PR #6063: URL: https://github.com/apache/iceberg/pull/6063#discussion_r1007421071 ## spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderDeletes.java: ## @@ -533,55 +534,52 @@ public void testIsDeletedColumnWithoutDeleteFile

[GitHub] [iceberg] wypoon commented on issue #5927: Vectorized reading of parquet in an updated table with 'merge-on-read' returns wrong results

2022-10-27 Thread GitBox
wypoon commented on issue #5927: URL: https://github.com/apache/iceberg/issues/5927#issuecomment-1294191247 @rbalamohan this is fixed by #6026 (which is also ported to Spark 3.2 and 3.1). I'm not able to resolve this issue. Can you do it?

[GitHub] [iceberg] wypoon commented on a diff in pull request #6063: Spark 3.2: #6041 follow-up/cleanup

2022-10-27 Thread GitBox
wypoon commented on code in PR #6063: URL: https://github.com/apache/iceberg/pull/6063#discussion_r1007475490 ## spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderDeletes.java: ## @@ -533,55 +534,52 @@ public void testIsDeletedColumnWithoutDeleteFile(

[GitHub] [iceberg] chenwyi2 commented on issue #6066: flink restore failed with filenotfound

2022-10-27 Thread GitBox
chenwyi2 commented on issue #6066: URL: https://github.com/apache/iceberg/issues/6066#issuecomment-1294294691 > `.avro` might be a manifest file. do you have the complete stack trace? Which Flink version? > > I couldn't find this log line in 1.13 (or 1.14 and 1.15). > > ```

[GitHub] [iceberg] rbalamohan commented on issue #5927: Vectorized reading of parquet in an updated table with 'merge-on-read' returns wrong results

2022-10-27 Thread GitBox
rbalamohan commented on issue #5927: URL: https://github.com/apache/iceberg/issues/5927#issuecomment-1294363557 Thanks @wypoon . Closing this ticket.

[GitHub] [iceberg] rbalamohan closed issue #5927: Vectorized reading of parquet in an updated table with 'merge-on-read' returns wrong results

2022-10-27 Thread GitBox
rbalamohan closed issue #5927: Vectorized reading of parquet in an updated table with 'merge-on-read' returns wrong results URL: https://github.com/apache/iceberg/issues/5927

[GitHub] [iceberg] huaxingao commented on a diff in pull request #6065: Fix TestAggregateBinding

2022-10-27 Thread GitBox
huaxingao commented on code in PR #6065: URL: https://github.com/apache/iceberg/pull/6065#discussion_r1007578132 ## api/src/test/java/org/apache/iceberg/expressions/TestAggregateBinding.java: ## @@ -18,44 +18,28 @@ */ package org.apache.iceberg.expressions; -import java.uti

[GitHub] [iceberg] stevenzwu commented on issue #6066: flink restore failed with filenotfound

2022-10-27 Thread GitBox
stevenzwu commented on issue #6066: URL: https://github.com/apache/iceberg/issues/6066#issuecomment-1294422350 > Start to flush snapshot state to state backend This happens in `IcebergFilesCommitter#snapshotState`. if checkpoint N didn't complete successfully, the written manifest fi

[GitHub] [iceberg] lirui-apache opened a new issue, #6071: Should ClientPool consider UGI when reusing a connection?

2022-10-27 Thread GitBox
lirui-apache opened a new issue, #6071: URL: https://github.com/apache/iceberg/issues/6071 ### Query engine Spark ### Question I have a question regarding the ClientPool in HiveCatalog. It seems we don’t consider UGI when reusing a client, does this mean we could perform

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-27 Thread GitBox
ajantha-bhat commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007624158 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possib

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-27 Thread GitBox
ajantha-bhat commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007627943 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possib

[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6059: Core,Spark: Fix raw generics usage of ManifestWriter

2022-10-27 Thread GitBox
ajantha-bhat commented on code in PR #6059: URL: https://github.com/apache/iceberg/pull/6059#discussion_r1007639523 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/TestManifestFileSerialization.java: ## @@ -199,7 +199,7 @@ private ManifestFile writeManifest(DataFile... file

[GitHub] [iceberg] nastra commented on a diff in pull request #6058: Core,Spark: Add metadata to Scan Report

2022-10-27 Thread GitBox
nastra commented on code in PR #6058: URL: https://github.com/apache/iceberg/pull/6058#discussion_r1007658388 ## core/src/main/java/org/apache/iceberg/BaseTableScan.java: ## @@ -135,11 +143,14 @@ public CloseableIterable planFiles() { planningDuration.stop();

[GitHub] [iceberg] nastra commented on pull request #6058: Core,Spark: Add metadata to Scan Report

2022-10-27 Thread GitBox
nastra commented on PR #6058: URL: https://github.com/apache/iceberg/pull/6058#issuecomment-1294492795 @rdblue I've rebased the PR

[GitHub] [iceberg] nastra commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-27 Thread GitBox
nastra commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007661613 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possible to

[GitHub] [iceberg] nastra commented on a diff in pull request #6059: Core,Spark: Fix raw generics usage of ManifestWriter

2022-10-27 Thread GitBox
nastra commented on code in PR #6059: URL: https://github.com/apache/iceberg/pull/6059#discussion_r1007695142 ## spark/v3.3/spark/src/test/java/org/apache/iceberg/TestManifestFileSerialization.java: ## @@ -199,7 +199,7 @@ private ManifestFile writeManifest(DataFile... files) th

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #5632: Core: Avoid reading ManifestFile when create ManifestReader

2022-10-28 Thread GitBox
ConeyLiu commented on code in PR #5632: URL: https://github.com/apache/iceberg/pull/5632#discussion_r1007807328 ## core/src/main/java/org/apache/iceberg/ManifestReader.java: ## @@ -126,15 +138,12 @@ protected ManifestReader( specId = Integer.parseInt(specProperty); }

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #5659: Core: Support set system level properties with environmental variables

2022-10-28 Thread GitBox
ConeyLiu commented on code in PR #5659: URL: https://github.com/apache/iceberg/pull/5659#discussion_r1007825452 ## core/src/main/java/org/apache/iceberg/SystemProperties.java: ## @@ -30,14 +32,23 @@ private SystemProperties() {} */ public static final String WORKER_THREAD

[GitHub] [iceberg] ConeyLiu commented on issue #4576: Read metadata table failed due to illegal character

2022-10-28 Thread GitBox
ConeyLiu commented on issue #4576: URL: https://github.com/apache/iceberg/issues/4576#issuecomment-1294801011 @rdblue @nastra Would you mind taking a look at this issue? This is true should be a bug.

[GitHub] [iceberg] nastra commented on issue #6071: Should ClientPool consider UGI when reusing a connection?

2022-10-28 Thread GitBox
nastra commented on issue #6071: URL: https://github.com/apache/iceberg/issues/6071#issuecomment-1294818143 @pvary is this something that you could potentially help answering?

[GitHub] [iceberg] Jonathan-Rosenberg commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-28 Thread GitBox
Jonathan-Rosenberg commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007972402 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also

[GitHub] [iceberg] nastra commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-28 Thread GitBox
nastra commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007974447 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possible to

[GitHub] [iceberg] Jonathan-Rosenberg commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-28 Thread GitBox
Jonathan-Rosenberg commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1007977385 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also

[GitHub] [iceberg] nastra commented on a diff in pull request #6068: Docs: Fix link in the Java Custom Catalog page

2022-10-28 Thread GitBox
nastra commented on code in PR #6068: URL: https://github.com/apache/iceberg/pull/6068#discussion_r1008039068 ## docs/java-custom-catalog.md: ## @@ -30,7 +30,7 @@ menu: It's possible to read an iceberg table either from an hdfs path or from a hive table. It's also possible to

[GitHub] [iceberg] nastra opened a new pull request, #6073: Core: Pass purgeRequested flag to REST server

2022-10-28 Thread GitBox
nastra opened a new pull request, #6073: URL: https://github.com/apache/iceberg/pull/6073 The motivation behind this change is that the REST server should know whether a `purge` was requested or not, rather than having the REST client throw an error.

[GitHub] [iceberg] nastra commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-10-28 Thread GitBox
nastra commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1008114969 ## core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java: ## @@ -222,8 +222,8 @@ public static LoadTableResponse createTable( throw new IllegalStateExcept

[GitHub] [iceberg] nastra commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-10-28 Thread GitBox
nastra commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1008117319 ## core/src/test/java/org/apache/iceberg/rest/RESTCatalogAdapter.java: ## @@ -320,7 +321,10 @@ public T handleRequest( case DROP_TABLE: { - C

[GitHub] [iceberg] nastra commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-10-28 Thread GitBox
nastra commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1008118659 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -223,8 +223,20 @@ public boolean dropTable(SessionContext context, TableIdentifier identifi

[GitHub] [iceberg] nastra commented on a diff in pull request #6073: Core: Pass purgeRequested flag to REST server

2022-10-28 Thread GitBox
nastra commented on code in PR #6073: URL: https://github.com/apache/iceberg/pull/6073#discussion_r1008121297 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -219,7 +219,7 @@ private static Item doExecuteRequest( restClient.head(path, headers, o

[GitHub] [iceberg] gaborkaszab opened a new pull request, #6074: API,Core: SnapshotManager to be created through Transaction

2022-10-28 Thread GitBox
gaborkaszab opened a new pull request, #6074: URL: https://github.com/apache/iceberg/pull/6074 Currently, SnapshotManager encapsulates its own BaseTransaction object and there is no way to expose it, as opposed to, for example, ExpireSnapshots, which can be created through the Transaction API. As a

[GitHub] [iceberg] gaborkaszab commented on pull request #6074: API,Core: SnapshotManager to be created through Transaction

2022-10-28 Thread GitBox
gaborkaszab commented on PR #6074: URL: https://github.com/apache/iceberg/pull/6074#issuecomment-1295175523 This fixes #5882 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [iceberg] flyrain merged pull request #6063: Spark 3.2: #6041 follow-up/cleanup

2022-10-28 Thread GitBox
flyrain merged PR #6063: URL: https://github.com/apache/iceberg/pull/6063 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

[GitHub] [iceberg] flyrain commented on pull request #6063: Spark 3.2: #6041 follow-up/cleanup

2022-10-28 Thread GitBox
flyrain commented on PR #6063: URL: https://github.com/apache/iceberg/pull/6063#issuecomment-1295225620 Merged. Thanks @wypoon! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [iceberg] huaxingao commented on pull request #6065: Fix TestAggregateBinding

2022-10-28 Thread GitBox
huaxingao commented on PR #6065: URL: https://github.com/apache/iceberg/pull/6065#issuecomment-1295322595 @rdblue Could you please take a look when you have a moment? Thanks a lot! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

[GitHub] [iceberg] jzhuge commented on pull request #4925: API: Add view interfaces

2022-10-28 Thread GitBox
jzhuge commented on PR #4925: URL: https://github.com/apache/iceberg/pull/4925#issuecomment-1295507260 @rdblue @wmoustafa Could you take a look at the PR again? Especially `ViewBuilder`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #4577: Fixes read metadata table failed due to illegal character

2022-10-28 Thread GitBox
szehon-ho commented on code in PR #4577: URL: https://github.com/apache/iceberg/pull/4577#discussion_r1008532668 ## core/src/test/java/org/apache/iceberg/TestMetadataTableScans.java: ## @@ -978,6 +1091,32 @@ private Set expectedManifestListPaths(Iterable snapshots, Long

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5632: Core: Avoid reading ManifestFile when create ManifestReader

2022-10-28 Thread GitBox
szehon-ho commented on code in PR #5632: URL: https://github.com/apache/iceberg/pull/5632#discussion_r1008565020 ## core/src/main/java/org/apache/iceberg/ManifestReader.java: ## @@ -101,20 +100,32 @@ private String fileClass() { protected ManifestReader( InputFile fi

[GitHub] [iceberg] aokolnychyi merged pull request #5783: Build: Update Spark to 3.3.1

2022-10-28 Thread GitBox
aokolnychyi merged PR #5783: URL: https://github.com/apache/iceberg/pull/5783 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg

[GitHub] [iceberg] aokolnychyi commented on pull request #5783: Build: Update Spark to 3.3.1

2022-10-28 Thread GitBox
aokolnychyi commented on PR #5783: URL: https://github.com/apache/iceberg/pull/5783#issuecomment-1295614949 Thanks, @wangyum! Thanks for reviewing, @singhpk234! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [iceberg] singhpk234 commented on issue #6044: Column pruning/projection is not happening in correlated queries (e.g Q94, Q16)

2022-10-28 Thread GitBox
singhpk234 commented on issue #6044: URL: https://github.com/apache/iceberg/issues/6044#issuecomment-1295643951 Can you please also add the Spark version in which you are observing this? I tried this with the sample UT below on Spark 3.3 and could see schema pruning happening. UT:

[GitHub] [iceberg] rbalamohan commented on issue #6044: Column pruning/projection is not happening in correlated queries (e.g Q94, Q16)

2022-10-28 Thread GitBox
rbalamohan commented on issue #6044: URL: https://github.com/apache/iceberg/issues/6044#issuecomment-1295648072 It was Spark 3.2.x, Prashant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[GitHub] [iceberg] flyrain commented on a diff in pull request #5783: Build: Update Spark to 3.3.1

2022-10-28 Thread GitBox
flyrain commented on code in PR #5783: URL: https://github.com/apache/iceberg/pull/5783#discussion_r1008594453 ## spark/v3.3/build.gradle: ## @@ -27,6 +27,16 @@ def sparkProjects = [ project(":iceberg-spark:iceberg-spark-runtime-${sparkMajorVersion}_${scalaVersion}"), ]

[GitHub] [iceberg] aokolnychyi closed pull request #6055: Spark 3.3: Use separate scan during file filtering in copy-on-write operations

2022-10-28 Thread GitBox
aokolnychyi closed pull request #6055: Spark 3.3: Use separate scan during file filtering in copy-on-write operations URL: https://github.com/apache/iceberg/pull/6055 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [iceberg] luoyuxia commented on issue #3395: While I tring to select table1 join table2,if fields of table2 was choosed, error like "java.lang.ArrayIndexOutOfBoundsException: 6" occurred

2022-10-29 Thread GitBox
luoyuxia commented on issue #3395: URL: https://github.com/apache/iceberg/issues/3395#issuecomment-1295758977 It should be fixed by the latest Flink. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6034: Python: GlueCatalog Full Implementation

2022-10-29 Thread GitBox
singhpk234 commented on code in PR #6034: URL: https://github.com/apache/iceberg/pull/6034#discussion_r1008659414 ## python/pyiceberg/catalog/glue.py: ## @@ -0,0 +1,453 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

[GitHub] [iceberg] singhpk234 commented on a diff in pull request #6034: Python: GlueCatalog Full Implementation

2022-10-29 Thread GitBox
singhpk234 commented on code in PR #6034: URL: https://github.com/apache/iceberg/pull/6034#discussion_r1008612703 ## python/pyiceberg/catalog/glue.py: ## @@ -0,0 +1,453 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #5632: Core: Avoid reading ManifestFile when create ManifestReader

2022-10-29 Thread GitBox
ConeyLiu commented on code in PR #5632: URL: https://github.com/apache/iceberg/pull/5632#discussion_r1008680666 ## core/src/main/java/org/apache/iceberg/ManifestReader.java: ## @@ -101,20 +100,32 @@ private String fileClass() { protected ManifestReader( InputFile fil

[GitHub] [iceberg] hililiwei opened a new pull request, #6075: Flink 1.15: Support change log scan task

2022-10-29 Thread GitBox
hililiwei opened a new pull request, #6075: URL: https://github.com/apache/iceberg/pull/6075 ## What is the purpose of the change Support for changelog scanning in the new Flink Source (FLIP-27). It can be turned on in the following ways: ``` SELECT * FROM tableName /*+ O

[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #4577: Fixes read metadata table failed due to illegal character

2022-10-29 Thread GitBox
ConeyLiu commented on code in PR #4577: URL: https://github.com/apache/iceberg/pull/4577#discussion_r1008682913 ## core/src/main/java/org/apache/iceberg/avro/BuildAvroProjection.java: ## @@ -107,13 +107,15 @@ public Schema record(Schema record, List names, Iterable s
