gszadovszky commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1316059558
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313882045
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
dependabot[bot] commented on PR #205:
URL: https://github.com/apache/parquet-format/pull/205#issuecomment-1704312502
Superseded by #213.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the
dependabot[bot] closed pull request #205: Bump libthrift from 0.16.0 to 0.18.1
URL: https://github.com/apache/parquet-format/pull/205
dependabot[bot] opened a new pull request, #213:
URL: https://github.com/apache/parquet-format/pull/213
Bumps [org.apache.thrift:libthrift](https://github.com/apache/thrift) from
0.16.0 to 0.19.0.
Release notes
Sourced from
etseidl commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1707166521
> I think we can now move to the simpler option of just putting
SizeStatistics on Column Index to consolidate everything? I would guess this
would also make implementations simpler.
emkornfield commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1707150050
Based on
https://github.com/apache/parquet-format/pull/197#discussion_r1316059558 I
think we can now move to the simpler option of just putting SizeStatistics on
Column Index
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1314311470
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
Fokko opened a new pull request, #214:
URL: https://github.com/apache/parquet-format/pull/214
Make sure you have checked _all_ steps below.
This should support Java 8 again
### Jira
- [ ] My PR addresses the following [Parquet
wgtmac opened a new pull request, #1137:
URL: https://github.com/apache/parquet-mr/pull/1137
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in
dependabot[bot] closed pull request #213: Bump org.apache.thrift:libthrift from
0.16.0 to 0.19.0
URL: https://github.com/apache/parquet-format/pull/213
Fokko merged PR #209:
URL: https://github.com/apache/parquet-format/pull/209
Fokko commented on code in PR #203:
URL: https://github.com/apache/parquet-format/pull/203#discussion_r1314324512
##
.github/workflows/test.yml:
##
@@ -26,15 +26,16 @@ jobs:
strategy:
fail-fast: false
matrix:
-java: [ '1.8', '11' ]
+java: [
wgtmac commented on PR #213:
URL: https://github.com/apache/parquet-format/pull/213#issuecomment-1704567581
@dependabot close
wgtmac merged PR #1136:
URL: https://github.com/apache/parquet-mr/pull/1136
rdblue commented on PR #214:
URL: https://github.com/apache/parquet-format/pull/214#issuecomment-1705491009
Thanks, @Fokko! Good to be unblocked for thrift updates.
rdblue merged PR #214:
URL: https://github.com/apache/parquet-format/pull/214
wgtmac merged PR #1137:
URL: https://github.com/apache/parquet-mr/pull/1137
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1317881962
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318059860
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1317866553
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
wgtmac commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1317993757
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
dependabot[bot] opened a new pull request, #215:
URL: https://github.com/apache/parquet-format/pull/215
Bumps org.slf4j:slf4j-api from 1.7.12 to 2.0.9.
[![Dependabot compatibility
dependabot[bot] commented on PR #204:
URL: https://github.com/apache/parquet-format/pull/204#issuecomment-1712817820
Superseded by #215.
dependabot[bot] closed pull request #204: Bump slf4j-api from 1.7.12 to 2.0.7
URL: https://github.com/apache/parquet-format/pull/204
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318275562
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
tustvold commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318314149
##
src/main/thrift/parquet.thrift:
##
@@ -529,7 +596,15 @@ struct DataPageHeader {
/** Encoding used for repetition levels **/
4: required Encoding
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318393474
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
mapleFU commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318576796
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318260892
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318273187
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
mapleFU commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318347248
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318413084
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318190332
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318334927
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
tustvold commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318362226
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
tustvold commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318386752
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
mapleFU commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318413890
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318367540
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318384010
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318570484
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318292827
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318565796
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
wgtmac commented on PR #215:
URL: https://github.com/apache/parquet-format/pull/215#issuecomment-1718760542
@dependabot ignore this dependency
dependabot[bot] closed pull request #215: PARQUET-2346: Bump
org.slf4j:slf4j-api from 1.7.12 to 2.0.9
URL: https://github.com/apache/parquet-format/pull/215
dependabot[bot] commented on PR #215:
URL: https://github.com/apache/parquet-format/pull/215#issuecomment-1718760575
OK, I won't notify you about org.slf4j:slf4j-api again, unless you re-open
this PR.
wwang-talend opened a new pull request, #1140:
URL: https://github.com/apache/parquet-mr/pull/1140
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
amousavigourabi opened a new pull request, #1141:
URL: https://github.com/apache/parquet-mr/pull/1141
Make sure you have checked _all_ steps below.
### Jira
- [x] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
zhangjiashen opened a new pull request, #1142:
URL: https://github.com/apache/parquet-mr/pull/1142
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in the PR title. For example, "PARQUET-1234: My
shangxinli commented on PR #1139:
URL: https://github.com/apache/parquet-mr/pull/1139#issuecomment-1722528243
@steveloughran Thanks a lot for creating this PR! This is an important
feature that improves the reading performance of Parquet. I just took a brief
look and it looks great! I
majdyz opened a new pull request, #1135:
URL: https://github.com/apache/parquet-mr/pull/1135
Make sure you have checked _all_ steps below.
### Jira
- [x] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET-2342) issues and references
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1312152587
##
src/main/thrift/parquet.thrift:
##
@@ -764,6 +845,14 @@ struct ColumnMetaData {
* in a single I/O.
*/
15: optional i32 bloom_filter_length;
+
+ /**
etseidl commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1701548513
I've implemented option 2 now. As expected, the size impact is somewhat less
due to less nesting in the thrift output. Here are some comparison numbers
(apologies, it seems my
etseidl commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1701566823
I forgot to mention that for option 2 I added
`unencoded_variable_width_stored_bytes` to the `PageLocation` struct.
Now I think I'm leaning towards option 2. For some of my
wgtmac commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1702498342
Thanks for the quick PoC! It seems that option 2 is the best at the moment.
But option 1 has more flexibility if we intend to add more fields to
SizeStatistics.
emkornfield commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1703090707
> As an implementation detail, can we ignore the rep-def histogram when
max-rep <= 1, max-def <= 1? Since we already have page-ordinal in OffsetIndex
and null-count in
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313375622
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313391303
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313391714
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313376199
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1703098968
OK, pushed updates. @etseidl @mapleFU @wgtmac @pitrou @gszadovszky
hopefully we can say this is a good version to prototype implementation on?
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313329517
##
src/main/thrift/parquet.thrift:
##
@@ -764,6 +845,14 @@ struct ColumnMetaData {
* in a single I/O.
*/
15: optional i32 bloom_filter_length;
+
+
etseidl commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1703115793
> hopefully we can say this is a good version to prototype implementation on?
Looks good to me. I'll get started now.
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313392696
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
mapleFU commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1703702682
Also cc @tustvold as arrow-rs parquet maintainer
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313547575
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1313652871
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,74 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1323069846
##
src/main/thrift/parquet.thrift:
##
@@ -764,6 +810,14 @@ struct ColumnMetaData {
* in a single I/O.
*/
15: optional i32 bloom_filter_length;
+
+ /**
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1323059489
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1323028211
##
src/main/thrift/parquet.thrift:
##
@@ -764,6 +810,14 @@ struct ColumnMetaData {
* in a single I/O.
*/
15: optional i32 bloom_filter_length;
+
+ /**
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1323524900
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1323505565
##
src/main/thrift/parquet.thrift:
##
@@ -764,6 +810,14 @@ struct ColumnMetaData {
* in a single I/O.
*/
15: optional i32 bloom_filter_length;
+
+
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1323506069
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
steveloughran opened a new pull request, #1139:
URL: https://github.com/apache/parquet-mr/pull/1139
Make sure you have checked _all_ steps below.
### Jira
- [X] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
wgtmac commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1324650964
##
src/main/thrift/parquet.thrift:
##
@@ -764,6 +810,14 @@ struct ColumnMetaData {
* in a single I/O.
*/
15: optional i32 bloom_filter_length;
+
+ /**
JFinis commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318737835
##
src/main/thrift/parquet.thrift:
##
@@ -529,7 +596,15 @@ struct DataPageHeader {
/** Encoding used for repetition levels **/
4: required Encoding
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318770227
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
wgtmac commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318848187
##
src/main/thrift/parquet.thrift:
##
@@ -529,7 +596,15 @@ struct DataPageHeader {
/** Encoding used for repetition levels **/
4: required Encoding
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1320231941
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1320143220
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1320256768
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1320192142
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319135598
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318930925
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319130288
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1073,15 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319130534
##
src/main/thrift/parquet.thrift:
##
@@ -583,7 +659,12 @@ struct DataPageHeaderV2 {
If missing it is considered compressed */
7: optional bool
etseidl commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319212210
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319219710
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
wgtmac commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319252134
##
src/main/thrift/parquet.thrift:
##
@@ -977,6 +1038,25 @@ struct ColumnIndex {
/** A list containing the number of null values for each page **/
5:
tustvold commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318957040
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition
pitrou commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319010418
##
src/main/thrift/parquet.thrift:
##
@@ -191,6 +191,73 @@ enum FieldRepetitionType {
REPEATED = 2;
}
+/**
+ * A histogram of repetition and definition