[jira] [Commented] (PARQUET-2122) Adding Bloom filter to small Parquet file bloats in size X1700

2022-02-14 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17492418#comment-17492418 ] Junjie Chen commented on PARQUET-2122: -- That's the default size of the bloom filter. Please

[jira] [Commented] (PARQUET-1992) Cannot build from tarball because of git submodules

2021-03-01 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293285#comment-17293285 ] Junjie Chen commented on PARQUET-1992: -- How about make the related tests required in Travis CI

Re: [VOTE] Release Apache Parquet 1.12.0 RC2

2021-03-01 Thread Junjie Chen
Hi I downloaded the package and ran mvn clean install, it failed with message: [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:exec (git submodule update) on project parquet-format-structures: Command execution failed.: Process exited with an error: 128 (Exit value: 128)

[jira] [Comment Edited] (PARQUET-1805) Refactor the configuration for bloom filters

2021-02-01 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276198#comment-17276198 ] Junjie Chen edited comment on PARQUET-1805 at 2/1/21, 9:43 AM: --- I think

[jira] [Commented] (PARQUET-1805) Refactor the configuration for bloom filters

2021-02-01 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276198#comment-17276198 ] Junjie Chen commented on PARQUET-1805: -- I think what [~yumwang] concern is we enable all columns

[jira] [Commented] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2020-12-02 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242217#comment-17242217 ] Junjie Chen commented on PARQUET-1851: -- Even the client doesn't write data, we should not throw

[jira] [Reopened] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2020-12-02 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reopened PARQUET-1851: -- > ParquetMetadataConveter throws NPE in an Iceberg unit t

[jira] [Resolved] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2020-07-08 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1851. -- Resolution: Not A Bug This is due to the client didn't write data successfully

Re: Parquet - 41

2020-05-13 Thread Junjie Chen
Indices (PARQUET-1404 <https://issues.apache.org/jira/projects/PARQUET/issues/PARQUET-1404?filter=allopenissues>) > > Regards > Arun Balajiee > > > From: Junjie Chen > Sent: 20 April 2020 22:20 > To: dev@parquet.apache.org

[jira] [Updated] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2020-04-28 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1851: - Description: When writing data to parquet in an Iceberg unit test, it throws NPE as below

[jira] [Created] (PARQUET-1851) ParquetMetadataConveter throws NPE in an Iceberg unit test

2020-04-28 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1851: Summary: ParquetMetadataConveter throws NPE in an Iceberg unit test Key: PARQUET-1851 URL: https://issues.apache.org/jira/browse/PARQUET-1851 Project: Parquet

Re: Filtering GitBox e-mails out of dev@?

2020-04-25 Thread Junjie Chen
Do we need to start a vote on this? Maybe set up a new mail list and route to it? On Tue, Apr 21, 2020 at 8:58 PM Junjie Chen wrote: > Just open a Jira and copy .asf.yaml in PR: > https://github.com/apache/parquet-mr/pull/788. > > On Tue, Apr 21, 2020 at 8:28 PM Wes McKinney wr

[jira] [Commented] (PARQUET-1327) [C++] Bloom filter read/write implementation

2020-04-23 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091083#comment-17091083 ] Junjie Chen commented on PARQUET-1327: -- The Bloom filter in the MR side is used for row group

Re: Filtering GitBox e-mails out of dev@?

2020-04-21 Thread Junjie Chen
Just open a Jira and copy .asf.yaml in PR: https://github.com/apache/parquet-mr/pull/788. On Tue, Apr 21, 2020 at 8:28 PM Wes McKinney wrote: > hi, > > Would someone please take a look at this? > > Thanks > > On Mon, Apr 20, 2020 at 8:08 AM Wes McKinney wrote: > > > > Infra made some changes

[jira] [Created] (PARQUET-1847) Filter out github notification from dev mail list

2020-04-21 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1847: Summary: Filter out github notification from dev mail list Key: PARQUET-1847 URL: https://issues.apache.org/jira/browse/PARQUET-1847 Project: Parquet Issue

Re: Parquet - 41

2020-04-20 Thread Junjie Chen
As far as I know, not implemented yet. The thrift is up-to-date now, would you like to contribute? Things we need are: 1. xxhash C++ implementation 2. reader and writer for the bloom filter 3. filtering logic for row group Implementing the reader would be a good start. On Tue, Apr 21, 2020

[jira] [Commented] (PARQUET-1815) Add union API to BloomFilter interface

2020-03-13 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058571#comment-17058571 ] Junjie Chen commented on PARQUET-1815: -- I think you can go ahead:) > Add union API to BloomFil

Re: [Announce] new committer: Xinli Shang

2020-03-12 Thread Junjie Chen
Congrats! On Fri, Mar 13, 2020 at 5:01 AM Driesprong, Fokko wrote: > Great to have you onboard Xinli, welcome! > > Cheers, Fokko > > On Thu, Mar 12, 2020 at 21:50 Julien Le Dem wrote: > > > The Project Management Committee (PMC) for Apache Parquet > > has invited Xinli Shang to become a

[jira] [Created] (PARQUET-1815) Add union API to BloomFilter interface

2020-03-12 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1815: Summary: Add union API to BloomFilter interface Key: PARQUET-1815 URL: https://issues.apache.org/jira/browse/PARQUET-1815 Project: Parquet Issue Type

[jira] [Resolved] (PARQUET-41) Add bloom filters to parquet statistics

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-41. Fix Version/s: 1.11.1 Resolution: Fixed > Add bloom filters to parquet statist

[jira] [Assigned] (PARQUET-1453) Support nested column Bloom filter

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1453: Assignee: Junjie Chen > Support nested column Bloom fil

[jira] [Resolved] (PARQUET-1453) Support nested column Bloom filter

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1453. -- Fix Version/s: 1.11.1 Resolution: Fixed > Support nested column Bloom fil

[jira] [Resolved] (PARQUET-1328) [java]Bloom filter read/write implementation

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1328. -- Fix Version/s: 1.11.1 Resolution: Fixed > [java]Bloom filter read/wr

[jira] [Resolved] (PARQUET-1516) Store Bloom filters near to footer.

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1516. -- Fix Version/s: 1.11.1 Assignee: Junjie Chen Resolution: Fixed > Store Bl

[jira] [Resolved] (PARQUET-1795) merge bloom filter feature branch to master

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1795. -- Resolution: Not A Problem > merge bloom filter feature branch to mas

[jira] [Assigned] (PARQUET-1795) merge bloom filter feature branch to master

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1795: Assignee: Junjie Chen > merge bloom filter feature branch to mas

[jira] [Resolved] (PARQUET-1391) [java] Integrate Bloom filter logic

2020-02-26 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1391. -- Fix Version/s: 1.11.1 Assignee: Junjie Chen Resolution: Fixed > [j

[jira] [Created] (PARQUET-1795) merge bloom filter feature branch to master

2020-02-12 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1795: Summary: merge bloom filter feature branch to master Key: PARQUET-1795 URL: https://issues.apache.org/jira/browse/PARQUET-1795 Project: Parquet Issue Type

Re: [DISCUSS] merge bloom filter branch to master

2020-02-11 Thread Junjie Chen
sting the bloom filter branch with Map Reduce examples and it > sounds good. > I'll add an example to the parquet-hadoop package. > > Cheers, > Walid > > On Fri, Jan 10, 2020 at 09:22, Junjie Chen wrote: > > > Hi Community > > > > The bloom filter

[jira] [Commented] (PARQUET-1758) InternalParquetRecordReader Logging it Too Verbose

2020-01-12 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013979#comment-17013979 ] Junjie Chen commented on PARQUET-1758: -- It might be better to draft a discussion on mail list

[DISCUSS] merge bloom filter branch to master

2020-01-10 Thread Junjie Chen
Hi Community The bloom filter branch now contains basic functional logic that includes read/write and filtering. Though the feature still needs polishing and improving, I'd suggest merging back to master first so that more people could use it and provide feedback. What do you think?

Re: [VOTE] Release Apache Parquet Format 2.8.0 RC0

2020-01-09 Thread Junjie Chen
+1 (non-binding) On Wed, Jan 8, 2020 at 5:24 PM Gabor Szadovszky wrote: > Thanks, Ryan for highlighting this. With your vote we have the required > three +1 binding votes. > Let's wait for a couple of days if anyone has a problem with the thrift > compatibility or anything else. > I'll finalize

Re: Spotless

2020-01-08 Thread Junjie Chen
I see spotless could handle import order and remove unused import as well, we can make use of them step by step. +1 to use spotless. On Thu, Jan 9, 2020 at 2:36 AM Ryan Blue wrote: > +1 for spotless checks. > > On Wed, Jan 8, 2020 at 7:13 AM Driesprong, Fokko > wrote: > > > Y'all, > > > >

[jira] [Created] (PARQUET-1741) APIs backward compatibility issues cause master branch build failure

2020-01-08 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1741: Summary: APIs backward compatibility issues cause master branch build failure Key: PARQUET-1741 URL: https://issues.apache.org/jira/browse/PARQUET-1741 Project

[jira] [Created] (PARQUET-1733) [java]keep ColumnChunkPageWriteStore constructor from 1.10.1

2019-12-30 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1733: Summary: [java]keep ColumnChunkPageWriteStore constructor from 1.10.1 Key: PARQUET-1733 URL: https://issues.apache.org/jira/browse/PARQUET-1733 Project: Parquet

Re: [VOTE] Release Apache Parquet 1.11.0 RC7

2019-11-15 Thread Junjie Chen
+1 Verified signature, checksum and ran mvn install successfully. On Thu, Nov 14, 2019 at 2:05 PM Wang, Yuming wrote: > > +1 > Tested Parquet 1.11.0 with Spark SQL module: build/sbt "sql/test-only" > -Phadoop-3.2 > > On 2019/11/13, 21:33, "Gabor Szadovszky" wrote: > > Hi everyone, > > I propose the

Re: [VOTE] Add BYTE_STREAM_SPLIT encoding to Apache Parquet

2019-11-07 Thread Junjie Chen
+1 from me to add BYTE_STREAM_SPLIT to parquet-format. On Thu, Nov 7, 2019 at 6:07 PM Gabor Szadovszky wrote: > > +1 for adding BYTE_STREAM_SPLIT encoding to parquet-format. > > On Tue, Nov 5, 2019 at 11:22 PM Wes McKinney wrote: > > > +1 from me on adding the FP encoding > > > > On Sat, Nov 2, 2019 at 4:51

Re: release process - using rc tags

2019-10-30 Thread Junjie Chen
+1 On Thu, Oct 31, 2019 at 6:35 AM XU Qinghui wrote: > > +1 > > On Wed, Oct 30, 2019 at 16:53, Driesprong, Fokko wrote: > > > +1 > > > > On Wed, Oct 30, 2019 at 16:51 Ryan Blue wrote: > > > > > +1 > > > > > > I recently built the release process for Iceberg and that's what we > > decided > > > to go

Re: Stalebot

2019-10-23 Thread Junjie Chen
Sounds good to have it. We might want to set the expiration limit to a larger value according to commit history. On Wed, Oct 23, 2019 at 9:32 PM Driesprong, Fokko wrote: > > Hi all, > > I would suggest enabling Stalebot on the parquet-mr repo: > https://probot.github.io/apps/stale/ > > Right now we have a lot

[jira] [Updated] (PARQUET-319) Define the parquet bloom filter statistics in parquet format

2019-10-10 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-319: Fix Version/s: format-2.7.0 > Define the parquet bloom filter statistics in parquet for

[jira] [Created] (PARQUET-1658) travis preparing script for bloom-filter branch failed

2019-09-20 Thread Junjie Chen (Jira)
Junjie Chen created PARQUET-1658: Summary: travis preparing script for bloom-filter branch failed Key: PARQUET-1658 URL: https://issues.apache.org/jira/browse/PARQUET-1658 Project: Parquet

[jira] [Commented] (PARQUET-1657) [C++] Change Bloom filter implementation to use xxhash

2019-09-18 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932998#comment-16932998 ] Junjie Chen commented on PARQUET-1657: -- Great, the Bloom filter thrift definition was agreed

[jira] [Resolved] (PARQUET-1617) Add more details to bloom filter spec

2019-09-09 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1617. -- Resolution: Fixed > Add more details to bloom filter s

[jira] [Resolved] (PARQUET-1609) support xxhash in bloom filter

2019-09-09 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1609. -- Resolution: Fixed > support xxhash in bloom fil

[jira] [Commented] (PARQUET-1570) Publish 1.11.0 to maven central

2019-09-02 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921113#comment-16921113 ] Junjie Chen commented on PARQUET-1570: -- We may need to resolve PARQUET-1434 at first. > Publ

[jira] [Assigned] (PARQUET-1592) update hash naming of bloom filter

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1592: Assignee: Junjie Chen > update hash naming of bloom fil

[jira] [Resolved] (PARQUET-1630) Resolve Bloom filter spec concerns

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1630. -- Resolution: Fixed > Resolve Bloom filter spec conce

[jira] [Resolved] (PARQUET-1592) update hash naming of bloom filter

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1592. -- Resolution: Fixed > update hash naming of bloom fil

[jira] [Assigned] (PARQUET-1630) Resolve Bloom filter spec concerns

2019-08-30 Thread Junjie Chen (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1630: Assignee: Junjie Chen > Resolve Bloom filter spec conce

[jira] [Commented] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-09 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903870#comment-16903870 ] Junjie Chen commented on PARQUET-1632: -- Reopen this first.   I think the ByteInput get from

[jira] [Resolved] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-08 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen resolved PARQUET-1632. -- Resolution: Not A Problem it is a configuration issue. > Negative initial size when writ

[jira] [Commented] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-08 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902820#comment-16902820 ] Junjie Chen commented on PARQUET-1632: -- The CapacityByteArrayOutputStream is overflowed since

[jira] [Assigned] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-07 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1632: Assignee: Junjie Chen > Negative initial size when writing large values in parquet

[jira] [Commented] (PARQUET-1632) Negative initial size when writing large values in parquet-mr

2019-08-07 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901892#comment-16901892 ] Junjie Chen commented on PARQUET-1632: -- I will take a look into this. > Negative initial s

[jira] [Created] (PARQUET-1630) Resolve Bloom filter spec concerns

2019-08-04 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1630: Summary: Resolve Bloom filter spec concerns Key: PARQUET-1630 URL: https://issues.apache.org/jira/browse/PARQUET-1630 Project: Parquet Issue Type: Sub-task

[jira] [Commented] (PARQUET-1326) [C++] Cross compatibility support with parquet-mr

2019-07-31 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897323#comment-16897323 ] Junjie Chen commented on PARQUET-1326: -- We need to consider to make the integration test

[jira] [Commented] (PARQUET-1434) Release parquet-mr 1.11.0

2019-07-22 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890604#comment-16890604 ] Junjie Chen commented on PARQUET-1434: -- [~gszadovszky],  What remaining contents are still

[jira] [Created] (PARQUET-1625) Update parquet thrift to align with spec

2019-07-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1625: Summary: Update parquet thrift to align with spec Key: PARQUET-1625 URL: https://issues.apache.org/jira/browse/PARQUET-1625 Project: Parquet Issue Type: Sub

[jira] [Created] (PARQUET-1617) Add more details to bloom filter spec

2019-07-05 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1617: Summary: Add more details to bloom filter spec Key: PARQUET-1617 URL: https://issues.apache.org/jira/browse/PARQUET-1617 Project: Parquet Issue Type

[jira] [Created] (PARQUET-1609) support xxhash in bloom filter

2019-06-25 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1609: Summary: support xxhash in bloom filter Key: PARQUET-1609 URL: https://issues.apache.org/jira/browse/PARQUET-1609 Project: Parquet Issue Type: Improvement

[jira] [Updated] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.8.0

2019-06-20 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1552: - Description: Current protoc-jar-maven-plugin has a problem when building project after a proxy

[jira] [Commented] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.8.0

2019-06-20 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869108#comment-16869108 ] Junjie Chen commented on PARQUET-1552: -- v3.7.0.1 does not fix the problem, v3.8.0 fix

[jira] [Updated] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.8.0

2019-06-20 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1552: - Summary: upgrade protoc-jar-maven-plugin to 3.8.0 (was: upgrade protoc-jar-maven-plugin

[jira] [Created] (PARQUET-1592) update hash naming of bloom filter

2019-06-11 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1592: Summary: update hash naming of bloom filter Key: PARQUET-1592 URL: https://issues.apache.org/jira/browse/PARQUET-1592 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1553) Support xxHash in Bloom filter

2019-04-02 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1553: Summary: Support xxHash in Bloom filter Key: PARQUET-1553 URL: https://issues.apache.org/jira/browse/PARQUET-1553 Project: Parquet Issue Type: New Feature

[jira] [Created] (PARQUET-1552) upgrade protoc-jar-maven-plugin to 3.7.0.1

2019-03-28 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1552: Summary: upgrade protoc-jar-maven-plugin to 3.7.0.1 Key: PARQUET-1552 URL: https://issues.apache.org/jira/browse/PARQUET-1552 Project: Parquet Issue Type

[jira] [Assigned] (PARQUET-319) Define the parquet bloom filter statistics in parquet format

2019-02-14 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-319: --- Assignee: Junjie Chen (was: Ferdinand Xu) > Define the parquet bloom filter statist

[jira] [Created] (PARQUET-1516) Store Bloom filters near to footer.

2019-01-27 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1516: Summary: Store Bloom filters near to footer. Key: PARQUET-1516 URL: https://issues.apache.org/jira/browse/PARQUET-1516 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1495) Perform encoding before bloom filters write out

2019-01-21 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1495: Summary: Perform encoding before bloom filters write out Key: PARQUET-1495 URL: https://issues.apache.org/jira/browse/PARQUET-1495 Project: Parquet Issue

[jira] [Commented] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744050#comment-16744050 ] Junjie Chen commented on PARQUET-1493: -- Thanks I open a issue there: https://github.com/os72

[jira] [Comment Edited] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744019#comment-16744019 ] Junjie Chen edited comment on PARQUET-1493 at 1/16/19 1:12 PM: --- Hi

[jira] [Commented] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744019#comment-16744019 ] Junjie Chen commented on PARQUET-1493: -- Hi [~gszadovszky] I just tried 3.6.0.2, still failed

[jira] [Updated] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1493: - Description: I checked out master branch and executed "mvn clean install -Dskip

[jira] [Created] (PARQUET-1493) maven protobuf plugin not work properly

2019-01-16 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1493: Summary: maven protobuf plugin not work properly Key: PARQUET-1493 URL: https://issues.apache.org/jira/browse/PARQUET-1493 Project: Parquet Issue Type: Bug

[jira] [Commented] (PARQUET-1328) [java]Bloom filter read/write implementation

2019-01-11 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740965#comment-16740965 ] Junjie Chen commented on PARQUET-1328: -- [~zi], Jim had reviewed some on this and we need some more

[jira] [Created] (PARQUET-1453) Support nested column Bloom filter

2018-10-30 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1453: Summary: Support nested column Bloom filter Key: PARQUET-1453 URL: https://issues.apache.org/jira/browse/PARQUET-1453 Project: Parquet Issue Type: Sub-task

[jira] [Created] (PARQUET-1391) [java] Integrate Bloom filter logic

2018-08-19 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1391: Summary: [java] Integrate Bloom filter logic Key: PARQUET-1391 URL: https://issues.apache.org/jira/browse/PARQUET-1391 Project: Parquet Issue Type: Sub-task

[jira] [Updated] (PARQUET-1329) [C++] Integrate Bloom filter into row group filter logic

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1329: - Summary: [C++] Integrate Bloom filter into row group filter logic (was: integrate parquet

[jira] [Updated] (PARQUET-1328) [java]Bloom filter read/write implementation

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1328: - Summary: [java]Bloom filter read/write implementation (was: parquet bloom filter writer

[jira] [Assigned] (PARQUET-1328) [java]Bloom filter read/write implementation

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen reassigned PARQUET-1328: Assignee: Junjie Chen > [java]Bloom filter read/write implementat

[jira] [Updated] (PARQUET-1327) [C++]Bloom filter read/write implementation

2018-08-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1327: - Summary: [C++]Bloom filter read/write implementation (was: parquet bloom filter reader

[jira] [Commented] (PARQUET-1385) [C++] bloom_filter-test is very slow under valgrind

2018-08-17 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583489#comment-16583489 ] Junjie Chen commented on PARQUET-1385: -- std::seed_seq::generate takes more than 75% cpu cycles

[jira] [Commented] (PARQUET-1385) [C++] bloom_filter-test is very slow under valgrind

2018-08-17 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583425#comment-16583425 ] Junjie Chen commented on PARQUET-1385: -- The GetRandomString function is very slow, I can change

[jira] [Commented] (PARQUET-1380) move Bloom filter test binary to parquet-testing repo

2018-08-15 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581758#comment-16581758 ] Junjie Chen commented on PARQUET-1380: -- Hi [~wesmckinn], I created this to track following thing

[jira] [Created] (PARQUET-1380) move Bloom filter test binary to parquet-testing repo

2018-08-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1380: Summary: move Bloom filter test binary to parquet-testing repo Key: PARQUET-1380 URL: https://issues.apache.org/jira/browse/PARQUET-1380 Project: Parquet

[jira] [Created] (PARQUET-1377) [C++] replace shared_ptr to unique_ptr in Bloom filter buffer allocation

2018-08-10 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1377: Summary: [C++] replace shared_ptr to unique_ptr in Bloom filter buffer allocation Key: PARQUET-1377 URL: https://issues.apache.org/jira/browse/PARQUET-1377 Project

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-07-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550201#comment-16550201 ] Junjie Chen commented on PARQUET-41: [~aniket486], Thanks for watching this. Yes, I'm still

[jira] [Created] (PARQUET-1342) Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1342: Summary: Add bloom filter utility class Key: PARQUET-1342 URL: https://issues.apache.org/jira/browse/PARQUET-1342 Project: Parquet Issue Type: Sub-task

[jira] [Updated] (PARQUET-1332) [C++] Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1332: - Component/s: parquet-cpp > [C++] Add bloom filter utility cl

[jira] [Commented] (PARQUET-1342) Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527315#comment-16527315 ] Junjie Chen commented on PARQUET-1342: -- PR: https://github.com/apache/parquet-mr/pull/425 >

[jira] [Updated] (PARQUET-1326) [C++] Cross compatibility support with parquet-mr

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1326: - Summary: [C++] Cross compatibility support with parquet-mr (was: parquet bloom filter support

[jira] [Comment Edited] (PARQUET-1332) [C++] Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519914#comment-16519914 ] Junjie Chen edited comment on PARQUET-1332 at 6/29/18 8:28 AM: --- PR

[jira] [Updated] (PARQUET-1332) [C++] Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1332: - Summary: [C++] Add bloom filter utility class (was: Add bloom filter utility class) >

[jira] [Updated] (PARQUET-1332) Add bloom filter utility class

2018-06-29 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated PARQUET-1332: - Fix Version/s: (was: 1.10.0) 1.11.0 > Add bloom filter utility cl

[jira] [Commented] (PARQUET-1332) Add bloom filter utility class

2018-06-21 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519914#comment-16519914 ] Junjie Chen commented on PARQUET-1332: -- PR for parquet-mr: https://github.com/apache/parquet-mr

[jira] [Created] (PARQUET-1332) Add bloom filter utility class

2018-06-21 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1332: Summary: Add bloom filter utility class Key: PARQUET-1332 URL: https://issues.apache.org/jira/browse/PARQUET-1332 Project: Parquet Issue Type: Sub-task

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-06-19 Thread Junjie Chen (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517638#comment-16517638 ] Junjie Chen commented on PARQUET-41: [~jbapple], I just created a new parquet-format PR since

[jira] [Created] (PARQUET-1329) integrate parquet bloom filter into row group filter logic

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1329: Summary: integrate parquet bloom filter into row group filter logic Key: PARQUET-1329 URL: https://issues.apache.org/jira/browse/PARQUET-1329 Project: Parquet

[jira] [Created] (PARQUET-1328) parquet bloom filter writer implementation

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1328: Summary: parquet bloom filter writer implementation Key: PARQUET-1328 URL: https://issues.apache.org/jira/browse/PARQUET-1328 Project: Parquet Issue Type

[jira] [Created] (PARQUET-1327) parquet bloom filter reader implementation

2018-06-15 Thread Junjie Chen (JIRA)
Junjie Chen created PARQUET-1327: Summary: parquet bloom filter reader implementation Key: PARQUET-1327 URL: https://issues.apache.org/jira/browse/PARQUET-1327 Project: Parquet Issue Type
