hamza-tam opened a new pull request, #1041:
URL: https://github.com/apache/parquet-mr/pull/1041
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701921#comment-17701921
]
ASF GitHub Bot commented on PARQUET-:
-
pitrou commented on PR #193:
URL: ht
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701922#comment-17701922
]
ASF GitHub Bot commented on PARQUET-:
-
pitrou commented on PR #193:
URL: ht
pitrou commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474188220
Another possibility is a nice table:
```
+--++-+
| Page kind| RLE-encoded data kind | Prepend length? |
+---
pitrou commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474188853
Also, please someone with better knowledge of parquet-mr comment on
[https://github.com/apache/parquet-format/pull/193#issuecomment-1474171946].
--
This is an automated message from
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701920#comment-17701920
]
ASF GitHub Bot commented on PARQUET-:
-
pitrou commented on PR #193:
URL: ht
pitrou commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474186504
I think this should be more explicit, e.g.:
```
// The length-prepended version is used for:
// - in v1 data pages: definition levels, repetition levels, and RLE-encoded
boole
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701914#comment-17701914
]
ASF GitHub Bot commented on PARQUET-:
-
pitrou commented on PR #193:
URL: ht
pitrou commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474171946
Ok, so v2 data pages for RLE-encoded boolean do encode the length:
https://github.com/apache/parquet-mr/blob/1235003e742e6a76bf6cb8f7ed33e942fa12d0d5/parquet-column/src/main/java/or
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701907#comment-17701907
]
ASF GitHub Bot commented on PARQUET-:
-
pitrou commented on PR #193:
URL: ht
pitrou commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474154058
I was alluding to this comment:
> @wgtmac is correct that `length` is left out only in case of `v2` `DL` and
`RL`.
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701904#comment-17701904
]
ASF GitHub Bot commented on PARQUET-:
-
mapleFU commented on PR #193:
URL: h
mapleFU commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474140277
@pitrou I guess it's already within the DataPage. It's the first 4 bytes of
the deserialized data.
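The length handling discussed in this thread can be sketched roughly as follows. This is a minimal illustration and not parquet-mr code: the helper name `split_rle_section` and its parameters are hypothetical. It assumes the prefix is a 4-byte little-endian unsigned integer (the "first 4B" mentioned above) and that, when no prefix is present (per the thread, only `v2` `DL` and `RL`), the caller obtains the byte length from elsewhere, for example the page header.

```python
import struct


def split_rle_section(buf, offset, length_prepended, known_length=None):
    """Return (rle_bytes, next_offset) for one RLE-encoded section.

    Hypothetical helper for illustration only; not part of any Parquet library.
    """
    if length_prepended:
        # Assumption: the first 4 bytes hold the byte length of the
        # RLE data as a little-endian unsigned 32-bit integer.
        (length,) = struct.unpack_from("<I", buf, offset)
        start = offset + 4
    else:
        # No prefix: the caller must supply the length from another source.
        if known_length is None:
            raise ValueError("known_length is required when no length is prepended")
        length = known_length
        start = offset
    end = start + length
    if end > len(buf):
        raise ValueError("section runs past the end of the buffer")
    return bytes(buf[start:end]), end
```

A caller would pass `length_prepended=True` for sections that carry the prefix and `length_prepended=False` together with `known_length` for those that do not.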
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701880#comment-17701880
]
ASF GitHub Bot commented on PARQUET-:
-
pitrou commented on PR #193:
URL: ht
pitrou commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1474072383
Hmm, can you point me to the place where the length is written out for
RLE-encoded boolean data in v2 data pages?
gszadovszky commented on PR #31:
URL: https://github.com/apache/parquet-site/pull/31#issuecomment-1474023977
@wgtmac, I've found it finally: https://parquet.staged.apache.org/
I don't think staging makes sense this way. The two branches have already
diverged from each other. I think a bett
wgtmac commented on PR #31:
URL: https://github.com/apache/parquet-site/pull/31#issuecomment-1473954586
There is a
[document](https://github.com/apache/parquet-site/tree/production#staging) for
staging and production, but I still don't know where the staging site is.
OK, I will close
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701813#comment-17701813
]
ASF GitHub Bot commented on PARQUET-2256:
-
wgtmac commented on PR #195:
URL: ht
wgtmac commented on PR #195:
URL: https://github.com/apache/parquet-format/pull/195#issuecomment-1473943114
@gszadovszky Good point!
I have a relevant proposal
(https://github.com/apache/parquet-format/pull/194) for the bloom filter; would
you mind taking a look as well?
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701808#comment-17701808
]
ASF GitHub Bot commented on PARQUET-:
-
wgtmac commented on PR #193:
URL: ht
wgtmac commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1473932634
Sounds good. Let me fix it.
[
https://issues.apache.org/jira/browse/PARQUET-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701789#comment-17701789
]
Xinli Shang commented on PARQUET-1690:
--
It was quite a long time ago. I don't remem
[
https://issues.apache.org/jira/browse/PARQUET-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701775#comment-17701775
]
ASF GitHub Bot commented on PARQUET-2198:
-
shangxinli commented on PR #1005:
UR
shangxinli commented on PR #1005:
URL: https://github.com/apache/parquet-mr/pull/1005#issuecomment-1473894681
We already started working on the release. Please wait...
[
https://issues.apache.org/jira/browse/PARQUET-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701682#comment-17701682
]
ASF GitHub Bot commented on PARQUET-2198:
-
mdadil-dk commented on PR #1005:
URL
mdadil-dk commented on PR #1005:
URL: https://github.com/apache/parquet-mr/pull/1005#issuecomment-1473738111
Is there a new release plan for this? Or is there a SNAPSHOT/RC build to test?
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701660#comment-17701660
]
ASF GitHub Bot commented on PARQUET-:
-
gszadovszky commented on PR #193:
UR
gszadovszky commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1473665965
@wgtmac is correct that `length` is left out only in case of `v2` `DL` and
`RL`. Meanwhile, I agree with @pitrou that the note would be better placed in
the grammar spec:
``
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701637#comment-17701637
]
ASF GitHub Bot commented on PARQUET-2256:
-
gszadovszky commented on PR #195:
UR
gszadovszky commented on PR #195:
URL: https://github.com/apache/parquet-format/pull/195#issuecomment-1473613246
@mapleFU, I have discovered two unfortunate issues with the format
definition of bloom filters that would be nice to correct before adding
this change. (I am also fine solvi
gszadovszky commented on PR #31:
URL: https://github.com/apache/parquet-site/pull/31#issuecomment-1473548025
Even though the last release is old, without the release no one should
implement the new features. It should work similarly to the releases of
implementations like `parquet-mr`. So I
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701577#comment-17701577
]
Xuwei Fu commented on PARQUET-2256:
---
[~gszadovszky] Yes, I'd like to. I think having
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701575#comment-17701575
]
Gabor Szadovszky commented on PARQUET-2256:
---
[~mwish], would you mind to do s
[
https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701573#comment-17701573
]
ASF GitHub Bot commented on PARQUET-:
-
wgtmac commented on PR #193:
URL: ht
wgtmac commented on PR #193:
URL: https://github.com/apache/parquet-format/pull/193#issuecomment-1473382293
Gentle ping @gszadovszky
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Szadovszky reassigned PARQUET-2256:
-
Assignee: Xuwei Fu
> Adding Compression for BloomFilter
>
wgtmac commented on PR #31:
URL: https://github.com/apache/parquet-site/pull/31#issuecomment-1473373099
> Have you copied from master or from the latest release? (I think, the
latest release would be preferred.)
I copied from `master` because the latest v2.9.0 was released almost two
yea
gszadovszky commented on PR #31:
URL: https://github.com/apache/parquet-site/pull/31#issuecomment-1473359437
Thanks for taking care of this, @wgtmac!
Have you copied from `master` or from the latest release? (I think, the
latest release would be preferred.)
Also, what do you think abou
[
https://issues.apache.org/jira/browse/PARQUET-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701568#comment-17701568
]
Gabor Szadovszky commented on PARQUET-2258:
---
Thanks for fixing this, [~abstra
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701567#comment-17701567
]
ASF GitHub Bot commented on PARQUET-2256:
-
mapleFU commented on PR #195:
URL: h
mapleFU commented on PR #195:
URL: https://github.com/apache/parquet-format/pull/195#issuecomment-1473351026
@gszadovszky Would you mind taking a look?
[
https://issues.apache.org/jira/browse/PARQUET-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701566#comment-17701566
]
ASF GitHub Bot commented on PARQUET-2256:
-
mapleFU opened a new pull request, #
mapleFU opened a new pull request, #195:
URL: https://github.com/apache/parquet-format/pull/195
Make sure you have checked _all_ steps below.
### Jira
- [x] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them
[
https://issues.apache.org/jira/browse/PARQUET-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701561#comment-17701561
]
Gabor Szadovszky commented on PARQUET-1690:
---
[~humanoid], I don't know/rememb
[
https://issues.apache.org/jira/browse/PARQUET-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated PARQUET-2259:
Labels: pull-request-available (was: )
> [Site] Update parquet site
> --
wgtmac commented on PR #31:
URL: https://github.com/apache/parquet-site/pull/31#issuecomment-1473291804
I have copied corresponding text from parquet-format to make it easy to
update in the future. Please take a look, thanks! @gszadovszky @shangxinli
[
https://issues.apache.org/jira/browse/PARQUET-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701543#comment-17701543
]
László Bodor commented on PARQUET-2258:
---
thanks [~gszadovszky] and [~wgtmac] for
[
https://issues.apache.org/jira/browse/PARQUET-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor resolved PARQUET-2258.
---
Resolution: Fixed
> Storing toString fields in FilterPredicate instances can lead to memory
[
https://issues.apache.org/jira/browse/PARQUET-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated PARQUET-2258:
--
Fix Version/s: 1.12.3
> Storing toString fields in FilterPredicate instances can lead to memo
[
https://issues.apache.org/jira/browse/PARQUET-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701541#comment-17701541
]
Alexey Diomin edited comment on PARQUET-1690 at 3/17/23 7:17 AM:
[
https://issues.apache.org/jira/browse/PARQUET-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701541#comment-17701541
]
Alexey Diomin commented on PARQUET-1690:
[~gszadovszky] could you review the l