[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-25 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1194158751 One option is to provide another API call that releases the cached instance for only the current thread. What should we call it? I forget whether close or release is used more,

[GitHub] [parquet-mr] steveloughran closed pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-25 Thread GitBox
steveloughran closed pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable URL: https://github.com/apache/parquet-mr/pull/971 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [parquet-mr] guillaume-fetter commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support

2022-07-25 Thread GitBox
guillaume-fetter commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1193643505 Thank you very much!

[GitHub] [parquet-mr] shangxinli commented on pull request #980: PARQUET-2167: Fix CLI serializing footer with date fields

2022-07-24 Thread GitBox
shangxinli commented on PR #980: URL: https://github.com/apache/parquet-mr/pull/980#issuecomment-1193400394 LGTM

[GitHub] [parquet-mr] shangxinli commented on pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-24 Thread GitBox
shangxinli commented on PR #971: URL: https://github.com/apache/parquet-mr/pull/971#issuecomment-1193400138 This PR is combined with https://github.com/apache/parquet-mr/pull/951.

[GitHub] [parquet-mr] shangxinli commented on pull request #960: Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

2022-07-24 Thread GitBox
shangxinli commented on PR #960: URL: https://github.com/apache/parquet-mr/pull/960#issuecomment-1193399760 @sunchao Can you have a review?

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #960: Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

2022-07-24 Thread GitBox
shangxinli commented on code in PR #960: URL: https://github.com/apache/parquet-mr/pull/960#discussion_r928315330 ## parquet-common/src/main/java/org/apache/parquet/bytes/MultiBufferInputStream.java: ## @@ -379,4 +427,120 @@ public void remove() { second.remove(); }

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #960: Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

2022-07-24 Thread GitBox
shangxinli commented on code in PR #960: URL: https://github.com/apache/parquet-mr/pull/960#discussion_r928315196 ## parquet-common/src/main/java/org/apache/parquet/bytes/MultiBufferInputStream.java: ## @@ -238,8 +257,31 @@ public int read(byte[] bytes, int off, int len) { }

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #960: Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

2022-07-24 Thread GitBox
shangxinli commented on code in PR #960: URL: https://github.com/apache/parquet-mr/pull/960#discussion_r928314985 ## parquet-common/src/main/java/org/apache/parquet/bytes/MultiBufferInputStream.java: ## @@ -238,8 +257,31 @@ public int read(byte[] bytes, int off, int len) { }

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #960: Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

2022-07-24 Thread GitBox
shangxinli commented on code in PR #960: URL: https://github.com/apache/parquet-mr/pull/960#discussion_r928313950 ## parquet-common/src/main/java/org/apache/parquet/bytes/ByteBufferInputStream.java: ## @@ -157,4 +165,80 @@ public void reset() throws IOException { public

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #960: Performance optimization: Move all LittleEndianDataInputStream functionality into ByteBufferInputStream

2022-07-24 Thread GitBox
shangxinli commented on code in PR #960: URL: https://github.com/apache/parquet-mr/pull/960#discussion_r928314142 ## parquet-common/src/main/java/org/apache/parquet/bytes/ByteBufferInputStream.java: ## @@ -157,4 +165,80 @@ public void reset() throws IOException { public

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #957: PARQUET-2069: Allow list and array record types to be compatible.

2022-07-24 Thread GitBox
shangxinli commented on code in PR #957: URL: https://github.com/apache/parquet-mr/pull/957#discussion_r928310312 ## parquet-avro/src/main/java/org/apache/parquet/avro/AvroReadSupport.java: ## @@ -136,10 +137,22 @@ public RecordMaterializer prepareForRead( GenericData

[GitHub] [parquet-mr] shangxinli commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-24 Thread GitBox
shangxinli commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1193390004 @theosib-amazon, I am not concerned if release/close isn't called and I agree the caller must call release/close after finishing. My question is that before release/close is called,

[GitHub] [parquet-mr] shangxinli commented on pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-24 Thread GitBox
shangxinli commented on PR #900: URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1193386419 I think we are close to merging this PR. Resolve the conflict and use the imports, then we can merge.

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-24 Thread GitBox
shangxinli commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r928306582 ## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java: ## @@ -427,6 +485,218 @@ public void addBinary(Binary binary) { } +

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-24 Thread GitBox
shangxinli commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r928306114 ## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java: ## @@ -97,6 +127,46 @@ public MessageType convert(Class protobufClass) {

[GitHub] [parquet-mr] shangxinli merged pull request #956: Bump hadoop-common from 2.10.1 to 3.2.3

2022-07-24 Thread GitBox
shangxinli merged PR #956: URL: https://github.com/apache/parquet-mr/pull/956

[GitHub] [parquet-mr] shangxinli merged pull request #951: PARQUET-2134: Fix type checking in HadoopStreams.wrap

2022-07-24 Thread GitBox
shangxinli merged PR #951: URL: https://github.com/apache/parquet-mr/pull/951

[GitHub] [parquet-mr] shangxinli commented on pull request #951: PARQUET-2134: Fix type checking in HadoopStreams.wrap

2022-07-24 Thread GitBox
shangxinli commented on PR #951: URL: https://github.com/apache/parquet-mr/pull/951#issuecomment-1193383006 LGTM

[GitHub] [parquet-mr] bryanck opened a new pull request, #980: PARQUET-2167: Fix CLI parsing of footer with date fields

2022-07-22 Thread GitBox
bryanck opened a new pull request, #980: URL: https://github.com/apache/parquet-mr/pull/980 This PR fixes an issue when attempting to use the CLI to view the footer of a file with date fields. The error thrown is ```com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Java 8

[GitHub] [parquet-mr] jnturton commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-20 Thread GitBox
jnturton commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1191048626 > Are you concerned about leaking if release/close isn't called? I'm pretty sure that would result in leaks. I suppose that might be solvable if we added a finalize() method that

[GitHub] [parquet-mr] theosib-amazon commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-20 Thread GitBox
theosib-amazon commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1190765848 > @theosib-amazon Do you still have time for addressing the feedback? I think we are very close to merge. I'm not really sure which feedback to address. Are you concerned

[GitHub] [parquet-site] vinooganesh commented on pull request #27: Updating docsy to 2bfdac43ca13cb6605f1103581f77ba6e08a6c72

2022-07-20 Thread GitBox
vinooganesh commented on PR #27: URL: https://github.com/apache/parquet-site/pull/27#issuecomment-1190675554 cc @shangxinli one last one for you!

[GitHub] [parquet-site] shangxinli commented on pull request #26: Updated stale link pointing to incubator url

2022-07-20 Thread GitBox
shangxinli commented on PR #26: URL: https://github.com/apache/parquet-site/pull/26#issuecomment-1190652996 Great finding. Thanks @paliwalashish!

[GitHub] [parquet-site] shangxinli merged pull request #26: Updated stale link pointing to incubator url

2022-07-20 Thread GitBox
shangxinli merged PR #26: URL: https://github.com/apache/parquet-site/pull/26

[GitHub] [parquet-site] paliwalashish commented on pull request #26: Updated stale link pointing to incubator url

2022-07-20 Thread GitBox
paliwalashish commented on PR #26: URL: https://github.com/apache/parquet-site/pull/26#issuecomment-1190591595 @shangxinli kindly review when time permits

[GitHub] [parquet-mr] steveloughran commented on pull request #973: PARQUET-2155: Upgrade protobuf version to 3.17.3

2022-07-20 Thread GitBox
steveloughran commented on PR #973: URL: https://github.com/apache/parquet-mr/pull/973#issuecomment-1190501542 now this is merged in, should the jira be closed?

[GitHub] [parquet-mr] steveloughran commented on pull request #976: PARQUET-2158: Upgrade Hadoop dependency to version 3.2.0

2022-07-20 Thread GitBox
steveloughran commented on PR #976: URL: https://github.com/apache/parquet-mr/pull/976#issuecomment-1190483397 thanks.

[GitHub] [parquet-site] shangxinli commented on pull request #28: Add instructions about how to subscribe to dev list

2022-07-20 Thread GitBox
shangxinli commented on PR #28: URL: https://github.com/apache/parquet-site/pull/28#issuecomment-1190390040 Thanks Vinoo

[GitHub] [parquet-site] shangxinli merged pull request #28: Add instructions about how to subscribe to dev list

2022-07-20 Thread GitBox
shangxinli merged PR #28: URL: https://github.com/apache/parquet-site/pull/28

[GitHub] [parquet-site] vinooganesh commented on pull request #28: Add instructions about how to subscribe to dev list

2022-07-20 Thread GitBox
vinooganesh commented on PR #28: URL: https://github.com/apache/parquet-site/pull/28#issuecomment-1190259713 Done

[GitHub] [parquet-site] dossett commented on pull request #28: Add instructions about how to subscribe to dev list

2022-07-20 Thread GitBox
dossett commented on PR #28: URL: https://github.com/apache/parquet-site/pull/28#issuecomment-1190236161 Maybe add a link to the archive as well? https://lists.apache.org/list.html?dev@parquet.apache.org

[GitHub] [parquet-mr] ala commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-07-20 Thread GitBox
ala commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1190057981 @ggershinsky Do you know when the next release that will include the fix might happen? We are looking to unblock https://issues.apache.org/jira/browse/SPARK-39634 in Apache Spark.

[GitHub] [parquet-site] vinooganesh opened a new pull request, #28: Add instructions about how to subscribe to dev list

2022-07-19 Thread GitBox
vinooganesh opened a new pull request, #28: URL: https://github.com/apache/parquet-site/pull/28 Added instructions on how to subscribe. @shangxinli

[GitHub] [parquet-mr] steveloughran closed pull request #970: PARQUET-2150: parquet-protobuf to compile on Mac M1

2022-07-19 Thread GitBox
steveloughran closed pull request #970: PARQUET-2150: parquet-protobuf to compile on Mac M1 URL: https://github.com/apache/parquet-mr/pull/970

[GitHub] [parquet-mr] steveloughran commented on pull request #970: PARQUET-2150: parquet-protobuf to compile on Mac M1

2022-07-19 Thread GitBox
steveloughran commented on PR #970: URL: https://github.com/apache/parquet-mr/pull/970#issuecomment-1189454282 resolved by #973

[GitHub] [parquet-mr] ggershinsky merged pull request #973: PARQUET-2155: Upgrade protobuf version to 3.17.3

2022-07-19 Thread GitBox
ggershinsky merged PR #973: URL: https://github.com/apache/parquet-mr/pull/973

[GitHub] [parquet-mr] shangxinli commented on pull request #959: PARQUET-2126: Make cached (de)compressors thread-safe

2022-07-19 Thread GitBox
shangxinli commented on PR #959: URL: https://github.com/apache/parquet-mr/pull/959#issuecomment-1189198163 @theosib-amazon Do you still have time for addressing the feedback? I think we are very close to merge.

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-19 Thread GitBox
shangxinli commented on code in PR #971: URL: https://github.com/apache/parquet-mr/pull/971#discussion_r924651838 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] shangxinli merged pull request #976: PARQUET-2158: Upgrade Hadoop dependency to version 3.2.0

2022-07-19 Thread GitBox
shangxinli merged PR #976: URL: https://github.com/apache/parquet-mr/pull/976

[GitHub] [parquet-mr] shangxinli commented on pull request #973: PARQUET-2155: Upgrade protobuf version to 3.17.3

2022-07-19 Thread GitBox
shangxinli commented on PR #973: URL: https://github.com/apache/parquet-mr/pull/973#issuecomment-1189167965 LGTM

[GitHub] [parquet-mr] ggershinsky commented on pull request #973: PARQUET-2155: Upgrade protobuf version to 3.17.3

2022-07-18 Thread GitBox
ggershinsky commented on PR #973: URL: https://github.com/apache/parquet-mr/pull/973#issuecomment-1188622982 sure. if no other input by the end of this week, I'll merge it then.

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-18 Thread GitBox
steveloughran commented on code in PR #971: URL: https://github.com/apache/parquet-mr/pull/971#discussion_r923661793 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] sunchao commented on a diff in pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-14 Thread GitBox
sunchao commented on code in PR #971: URL: https://github.com/apache/parquet-mr/pull/971#discussion_r921361565 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-14 Thread GitBox
steveloughran commented on code in PR #971: URL: https://github.com/apache/parquet-mr/pull/971#discussion_r921125471 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-14 Thread GitBox
steveloughran commented on code in PR #971: URL: https://github.com/apache/parquet-mr/pull/971#discussion_r921124617 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] sunchao commented on a diff in pull request #971: PARQUET-2134: Improve binding to ByteBufferReadable

2022-07-13 Thread GitBox
sunchao commented on code in PR #971: URL: https://github.com/apache/parquet-mr/pull/971#discussion_r920576934 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] sunchao commented on pull request #973: PARQUET-2155: Upgrade protobuf version to 3.17.3

2022-07-13 Thread GitBox
sunchao commented on PR #973: URL: https://github.com/apache/parquet-mr/pull/973#issuecomment-1183753880 gently ping @shangxinli @ggershinsky

[GitHub] [parquet-mr] dossett commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support

2022-07-13 Thread GitBox
dossett commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1183301646 Terrific, thank you!

[GitHub] [parquet-mr] shangxinli merged pull request #963: PARQUET-1020 Add DynamicMessage writing support

2022-07-13 Thread GitBox
shangxinli merged PR #963: URL: https://github.com/apache/parquet-mr/pull/963

[GitHub] [parquet-mr] shangxinli commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support

2022-07-13 Thread GitBox
shangxinli commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1183298278 Merged. Thanks again!

[GitHub] [parquet-mr] dossett commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support

2022-07-13 Thread GitBox
dossett commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1183261724 Thank you @shangxinli! Do you want to merge it now or closer to the next release?

[GitHub] [parquet-mr] steveloughran opened a new pull request, #979: PARQUET-2165: Remove deprecated PathGlobPattern class

2022-07-12 Thread GitBox
steveloughran opened a new pull request, #979: URL: https://github.com/apache/parquet-mr/pull/979 Remove the deprecated classes PathGlobPattern and DeprecatedFieldProjectionFilter so that Parquet will compile against hadoop 3.x. If a thrift reader is configured to use

[GitHub] [parquet-mr] steveloughran commented on pull request #951: PARQUET-2134: Fix type checking in HadoopStreams.wrap

2022-07-12 Thread GitBox
steveloughran commented on PR #951: URL: https://github.com/apache/parquet-mr/pull/951#issuecomment-1181629067 thanks. created [HADOOP-18336](https://issues.apache.org/jira/browse/HADOOP-18336) tag FSDataInputStream.getWrappedStream() @Public/@Stable to make sure that hadoop code knows

[GitHub] [parquet-site] paliwalashish opened a new pull request, #26: Updated stale link pointing to incubator url

2022-07-10 Thread GitBox
paliwalashish opened a new pull request, #26: URL: https://github.com/apache/parquet-site/pull/26 Updated the url from incubator-parquet-mr to parquet-mr

[GitHub] [parquet-mr] steveloughran commented on pull request #970: PARQUET-2150: parquet-protobuf to compile on Mac M1

2022-07-07 Thread GitBox
steveloughran commented on PR #970: URL: https://github.com/apache/parquet-mr/pull/970#issuecomment-1177929495 oh, upgrading is better!

[GitHub] [parquet-mr] 7c00 commented on pull request #951: PARQUET-2134: Fix type checking in HadoopStreams.wrap

2022-07-04 Thread GitBox
7c00 commented on PR #951: URL: https://github.com/apache/parquet-mr/pull/951#issuecomment-1173982766 @shangxinli Thank you for reminding me. I have squashed the PR and added @steveloughran as the co-author.

[GitHub] [parquet-mr] mwong38 commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-04 Thread GitBox
mwong38 commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912817641 ## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java: ## @@ -97,6 +127,46 @@ public MessageType convert(Class protobufClass) {

[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-03 Thread GitBox
shangxinli commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912586665 ## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java: ## @@ -97,6 +127,46 @@ public MessageType convert(Class protobufClass) {

[GitHub] [parquet-mr] mwong38 commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-03 Thread GitBox
mwong38 commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912547741 ## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java: ## @@ -97,6 +127,46 @@ public MessageType convert(Class protobufClass) {

[GitHub] [parquet-mr] mwong38 commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-03 Thread GitBox
mwong38 commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912545610 ## parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java: ## @@ -97,6 +127,46 @@ public MessageType convert(Class protobufClass) {

[GitHub] [parquet-mr] mwong38 commented on a diff in pull request #900: PARQUET-2042: Add support for unwrapping common Protobuf wrappers and…

2022-07-03 Thread GitBox
mwong38 commented on code in PR #900: URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912545477 ## parquet-protobuf/pom.xml: ## @@ -57,6 +58,16 @@ protobuf-java ${protobuf.version} + + com.google.protobuf + protobuf-java-util +

[GitHub] [parquet-mr] sunchao commented on pull request #973: PARQUET-2155: Upgrade protobuf version to 3.17.3

2022-07-02 Thread GitBox
sunchao commented on PR #973: URL: https://github.com/apache/parquet-mr/pull/973#issuecomment-1172949094 updated

[GitHub] [parquet-mr] shangxinli merged pull request #958: PARQUET-2138: Add ShowBloomFilterCommand to parquet-cli

2022-07-02 Thread GitBox
shangxinli merged PR #958: URL: https://github.com/apache/parquet-mr/pull/958

[GitHub] [parquet-mr] shangxinli commented on pull request #958: PARQUET-2138: Add ShowBloomFilterCommand to parquet-cli

2022-07-02 Thread GitBox
shangxinli commented on PR #958: URL: https://github.com/apache/parquet-mr/pull/958#issuecomment-1172927105 Let's merge it now and we can add column decryption later.

[GitHub] [parquet-mr] shangxinli commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support

2022-07-02 Thread GitBox
shangxinli commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1172925425 Sorry for the late response and thank you @guillaume-fetter and @dossett for the contribution. Yeah, it seems low risk and LGTM.

[GitHub] [parquet-mr] shangxinli commented on pull request #951: PARQUET-2134: Fix type checking in HadoopStreams.wrap

2022-07-02 Thread GitBox
shangxinli commented on PR #951: URL: https://github.com/apache/parquet-mr/pull/951#issuecomment-1172924045 @7c00 and @steveloughran Thank you both for the great contribution! This PR comes from two authors. Can @7c00 add @steveloughran as the co-author to this PR?

[GitHub] [parquet-mr] shangxinli commented on pull request #970: PARQUET-2150: parquet-protobuf to compile on Mac M1

2022-07-02 Thread GitBox
shangxinli commented on PR #970: URL: https://github.com/apache/parquet-mr/pull/970#issuecomment-1172920665 @steveloughran Thanks for the explanation! Do you have concerns if we use [PR-973](https://github.com/apache/parquet-mr/pull/973)? It seems we can rely on proto-buf itself to solve

[GitHub] [parquet-mr] shangxinli commented on pull request #973: PARQUET-2155: Upgrade protobuf version to 3.20.1

2022-07-02 Thread GitBox
shangxinli commented on PR #973: URL: https://github.com/apache/parquet-mr/pull/973#issuecomment-1172920087 Yeah, we can do 3.20.1 later.

[GitHub] [parquet-mr] shangxinli commented on pull request #974: PARQUET-2156: Column bloom filter: Show bloom filters in tools

2022-07-02 Thread GitBox
shangxinli commented on PR #974: URL: https://github.com/apache/parquet-mr/pull/974#issuecomment-1172919112 @panbingkun Did you check [PR-958](https://github.com/apache/parquet-mr/pull/958)?

[GitHub] [parquet-mr] shangxinli commented on pull request #976: PARQUET-2158: Upgrade Hadoop dependency to version 3.2.0

2022-07-02 Thread GitBox
shangxinli commented on PR #976: URL: https://github.com/apache/parquet-mr/pull/976#issuecomment-1172918478 LGTM

[GitHub] [parquet-mr] ggershinsky merged pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-29 Thread GitBox
ggershinsky merged PR #978: URL: https://github.com/apache/parquet-mr/pull/978

[GitHub] [parquet-mr] ggershinsky commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-29 Thread GitBox
ggershinsky commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1170396062 Thanks @ala

[GitHub] [parquet-mr] ala commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-29 Thread GitBox
ala commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1170153775 @ggershinsky Thanks for the review. I tweaked the error assertion message to better match the rest of the codebase.

[GitHub] [parquet-mr] ggershinsky commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-29 Thread GitBox
ggershinsky commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1169963727 Thanks @chenjunjiedada. @ala, please handle the message comment, and I'll merge this PR.

[GitHub] [parquet-mr] chenjunjiedada commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-29 Thread GitBox
chenjunjiedada commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1169957186 This looks correct to me. The logic also exists in the iceberg row position reader. See: https://github.com/apache/iceberg/pull/1254#discussion_r461893642.

[GitHub] [parquet-mr] chenjunjiedada commented on a diff in pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-29 Thread GitBox
chenjunjiedada commented on code in PR #978: URL: https://github.com/apache/parquet-mr/pull/978#discussion_r909597173 ## parquet-hadoop/src/test/java/org/apache/parquet/filter2/recordlevel/PhoneBookWriter.java: ## @@ -359,7 +359,7 @@ public static List

[GitHub] [parquet-mr] ggershinsky commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-28 Thread GitBox
ggershinsky commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1168681667 Yep, I remember reviewing that PR. @prakharjain09, can you also have a look at this fix?

[GitHub] [parquet-mr] steveloughran commented on pull request #976: PARQUET-2158: Upgrade Hadoop dependency to version 3.2.0

2022-06-27 Thread GitBox
steveloughran commented on PR #976: URL: https://github.com/apache/parquet-mr/pull/976#issuecomment-1167200968 I will do a separate PR to remove `PathGlobPattern`, though not this week. It is used in `DeprecatedFieldProjectionFilter`, and that is used in

[GitHub] [parquet-mr] ala commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-24 Thread GitBox
ala commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1165708679 cc @ggershinsky

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #976: PARQUET-2158: Upgrade Hadoop dependency to version 3.2.0

2022-06-24 Thread GitBox
steveloughran commented on code in PR #976: URL: https://github.com/apache/parquet-mr/pull/976#discussion_r905967944 ## pom.xml: ## @@ -76,7 +76,7 @@ 2.13.2.2 0.14.2 shaded.parquet -2.10.1 +3.2.0 Review Comment: I was being unambitious. move to this,

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #976: PARQUET-2158: Upgrade Hadoop dependency to version 3.2.0

2022-06-24 Thread GitBox
steveloughran commented on code in PR #976: URL: https://github.com/apache/parquet-mr/pull/976#discussion_r905965620 ## parquet-thrift/src/main/java/org/apache/parquet/thrift/projection/deprecated/PathGlobPattern.java: ## @@ -20,8 +20,8 @@ import

[GitHub] [parquet-mr] ala commented on pull request #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-23 Thread GitBox
ala commented on PR #978: URL: https://github.com/apache/parquet-mr/pull/978#issuecomment-1164263841 cc @shangxinli This is a small follow-up bug fix for https://github.com/apache/parquet-mr/pull/945

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r903697361 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java: ## @@ -1796,5 +1882,314 @@ public void readAll(SeekableInputStream f,

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r903693351 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java: ## @@ -1796,5 +1882,314 @@ public void readAll(SeekableInputStream f,

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r903595526 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java: ## @@ -1796,5 +1882,314 @@ public void readAll(SeekableInputStream f,

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r903592957 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java: ## @@ -126,6 +127,42 @@ public class ParquetFileReader implements Closeable {

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r903469988 ## parquet-hadoop/src/main/java/org/apache/parquet/crypto/InternalFileDecryptor.java: ## @@ -61,10 +61,7 @@ public InternalFileDecryptor(FileDecryptionProperties

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r903403821 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java: ## @@ -1796,5 +1882,314 @@ public void readAll(SeekableInputStream f,

[GitHub] [parquet-mr] ggershinsky commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-22 Thread GitBox
ggershinsky commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r90337 ## parquet-common/src/main/java/org/apache/parquet/bytes/AsyncMultiBufferInputStream.java: ## @@ -0,0 +1,158 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [parquet-mr] parthchandra commented on pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-20 Thread GitBox
parthchandra commented on PR #968: URL: https://github.com/apache/parquet-mr/pull/968#issuecomment-1160693399 @shangxinli Thank you for the review! I'll address these comments asap. I am reviewing the thread pool and its initialization. IMO, it is better if there is no default

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #968: PARQUET-2149: Async IO implementation for ParquetFileReader

2022-06-20 Thread GitBox
steveloughran commented on code in PR #968: URL: https://github.com/apache/parquet-mr/pull/968#discussion_r901859862 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java: ## @@ -126,6 +127,42 @@ public class ParquetFileReader implements Closeable {

[GitHub] [parquet-mr] steveloughran commented on a diff in pull request #951: PARQUET-2134: Fix type checking in HadoopStreams.wrap

2022-06-20 Thread GitBox
steveloughran commented on code in PR #951: URL: https://github.com/apache/parquet-mr/pull/951#discussion_r901856428 ## parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopStreams.java: ## @@ -50,51 +46,45 @@ public class HadoopStreams { */ public static

[GitHub] [parquet-mr] steveloughran commented on pull request #970: PARQUET-2150: parquet-protobuf to compile on Mac M1

2022-06-20 Thread GitBox
steveloughran commented on PR #970: URL: https://github.com/apache/parquet-mr/pull/970#issuecomment-1160638077 This patch is based on Dongjoon's one for Hadoop; it tells Maven to use the x86 artifact on MacBook M1 builds. The sunchao one switches to a version of protobuf with a genuine

[GitHub] [parquet-mr] theosib-amazon commented on a diff in pull request #957: PARQUET-2069: Allow list and array record types to be compatible.

2022-06-20 Thread GitBox
theosib-amazon commented on code in PR #957: URL: https://github.com/apache/parquet-mr/pull/957#discussion_r901748898 ## parquet-avro/src/main/java/org/apache/parquet/avro/AvroReadSupport.java: ## @@ -136,10 +137,22 @@ public RecordMaterializer prepareForRead(

[GitHub] [parquet-mr] theosib-amazon commented on a diff in pull request #957: PARQUET-2069: Allow list and array record types to be compatible.

2022-06-20 Thread GitBox
theosib-amazon commented on code in PR #957: URL: https://github.com/apache/parquet-mr/pull/957#discussion_r901740673 ## parquet-avro/src/test/java/org/apache/parquet/avro/TestArrayListCompatibility.java: ## @@ -0,0 +1,51 @@ +/** + * Licensed to the Apache Software Foundation

[GitHub] [parquet-mr] theosib-amazon commented on a diff in pull request #957: PARQUET-2069: Allow list and array record types to be compatible.

2022-06-20 Thread GitBox
theosib-amazon commented on code in PR #957: URL: https://github.com/apache/parquet-mr/pull/957#discussion_r901733632 ## parquet-avro/src/main/java/org/apache/parquet/avro/AvroReadSupport.java: ## @@ -136,10 +137,22 @@ public RecordMaterializer prepareForRead(

[GitHub] [parquet-mr] ala opened a new pull request, #978: PARQUET-2161: Fix row index generation in combination with range filtering

2022-06-20 Thread GitBox
ala opened a new pull request, #978: URL: https://github.com/apache/parquet-mr/pull/978 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them in the
