Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-16 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1602698512 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/BatchReadConf.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-08 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1594829408 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/BatchReadConf.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-03 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1589730054 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/CometColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-03 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1589715889 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -184,7 +185,7 @@ public boolean orcVectorizationEnabled() { .parse();

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-05-03 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1589703297 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java: ## @@ -27,6 +27,10 @@ private SparkSQLProperties() {} // Controls whether

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585770312 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkColumnarReaderFactory.java: ## @@ -28,10 +29,12 @@ class SparkColumnarReaderFactory

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585769857 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatch.java: ## @@ -115,11 +115,11 @@ private String[][] computePreferredLocations() {

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585769554 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -32,23 +32,27 @@ import org.apache.iceberg.orc.ORC; import

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585769266 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometColumnReader.java: ## @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585767596 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchReader.java: ## @@ -74,48 +71,23 @@ public final ColumnarBatch

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585767204 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/ParquetReaderType.java: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1585438330 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/BaseBatchReader.java: ## @@ -32,23 +32,27 @@ import org.apache.iceberg.orc.ORC; import

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-30 Thread via GitHub
aokolnychyi commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2086131659 Will check today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-29 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-2083223651 @aokolnychyi I have addressed the comments. Could you please take one more look when you have a moment? Thanks a lot!

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-26 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1580539378 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkColumnarReaderFactory.java: ## @@ -49,7 +52,9 @@ public PartitionReader

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-26 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1580538817 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/VectorizedSparkParquetReaders.java: ## @@ -51,22 +53,37 @@ public class

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-26 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1580538480 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java: ## @@ -353,4 +354,12 @@ private boolean executorCacheLocalityEnabledInternal() {

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-26 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1580538348 ## gradle.properties: ## @@ -20,8 +20,8 @@ systemProp.defaultFlinkVersions=1.18 systemProp.knownFlinkVersions=1.16,1.17,1.18 systemProp.defaultHiveVersions=2

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-26 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1580538128 ## api/src/main/java/org/apache/iceberg/ReaderType.java: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-22 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1575419871 ## gradle.properties: ## @@ -20,8 +20,8 @@ systemProp.defaultFlinkVersions=1.18 systemProp.knownFlinkVersions=1.16,1.17,1.18 systemProp.defaultHiveVersions=2

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-22 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1575416837 ## api/src/main/java/org/apache/iceberg/ReaderType.java: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-22 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1575528604 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -196,6 +201,33 @@ private Duration toDuration(String time) { } } +

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-21 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1573865006 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/BaseColumnBatchLoader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Software

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-21 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1573864929 ## spark/v3.4/build.gradle: ## @@ -70,8 +70,11 @@ project(":iceberg-spark:iceberg-spark-${sparkMajorVersion}_${scalaVersion}") { exclude group: 'io.netty',

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571301509 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometColumnReader.java: ## @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571295057 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/BaseColumnBatchLoader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571293170 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/BaseColumnBatchLoader.java: ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571286633 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkConfParser.java: ## @@ -196,6 +201,40 @@ private Duration toDuration(String time) { } }

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-18 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1571283635 ## spark/v3.4/build.gradle: ## @@ -70,8 +70,11 @@ project(":iceberg-spark:iceberg-spark-${sparkMajorVersion}_${scalaVersion}") { exclude group:

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-16 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1568005309 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnarBatchReader.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-16 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566753916 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/VectorizedSparkParquetReaders.java: ## @@ -51,22 +53,43 @@ public class

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-16 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566753486 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnarBatchReader.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-16 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566752772 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-16 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566752634 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-16 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566753162 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnarBatchReader.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-15 Thread via GitHub
aokolnychyi commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1566673808 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-02 Thread via GitHub
huaxingao commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1548708375 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-04-02 Thread via GitHub
RussellSpitzer commented on code in PR #9841: URL: https://github.com/apache/iceberg/pull/9841#discussion_r1548617271 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/comet/CometIcebergColumnReader.java: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the

Re: [PR] Iceberg/Comet integration POC [iceberg]

2024-02-29 Thread via GitHub
huaxingao commented on PR #9841: URL: https://github.com/apache/iceberg/pull/9841#issuecomment-1972363257 cc @aokolnychyi @sunchao

[PR] Iceberg/Comet integration POC [iceberg]

2024-02-29 Thread via GitHub
huaxingao opened a new pull request, #9841: URL: https://github.com/apache/iceberg/pull/9841 This PR shows how I will integrate [Comet](https://github.com/apache/arrow-datafusion-comet) with Iceberg. The PR doesn't compile yet because we haven't released Comet yet. Also, Comet doesn't