Thanks. I tried this.
val projection: Seq[column.ColumnDescriptor] = // filter the columns I want from the schema
val projectionBuilder = Types.buildMessage()
for (col <- projection) {
  projectionBuilder.addField(Types.buildMessage().named(col.getPath.head))
}
OK sorry for all the messages but I have this working now:
On 4/13/18, 12:59 PM, "Andy Grove" wrote:
Immediately after sending this I realized that I also needed to pass the
projection message type in the following lines:
val columnIO = new
Hi,
I’m trying to read a parquet file with a projection from Scala and I can’t find
docs or examples for the correct way to do this.
I have the file schema and have filtered for the list of columns I need, so I
have a List of ColumnDescriptors.
It looks like I should call
Hi Ryan,
I'm writing some low-level performance tests to try and find a bottleneck on
our platform and have intentionally excluded Spark/Thrift/Presto etc and want
to test Parquet directly both with local files and against our HDFS cluster to
get performance metrics. Our parquet files were
Andy, what object model are you using to read? Usually you don't have a
list of column descriptors, you have an Avro read schema or a Thrift class
or something.
On Fri, Apr 13, 2018 at 10:31 AM, Andy Grove wrote:
> Hi,
>
> I’m trying to read a parquet file with a projection
I'd suggest using the Types builders to create your projection schema
(MessageType), then passing that schema to the
ParquetFileReader.setRequestedSchema method you found.
On Fri, Apr 13, 2018 at 10:40 AM, Andy Grove wrote:
> Hi Ryan,
>
> I'm writing some low-level
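A minimal sketch of the suggestion above, building the projection MessageType with the Types builders. The example schema, the `project` helper, and the object name are hypothetical and only there to make the snippet self-contained. The sketch assumes a projection of top-level fields and reuses the field types from the file schema rather than constructing new ones.

```scala
import org.apache.parquet.schema.{MessageType, MessageTypeParser, Types}

object ProjectionSchemaSketch {
  // Hypothetical file schema; in practice this would come from the file footer.
  val fileSchema: MessageType = MessageTypeParser.parseMessageType(
    """message example {
      |  required int32 id;
      |  optional binary name;
      |  optional double score;
      |}""".stripMargin)

  // Keep only the named top-level fields. Each field's Type is copied from
  // the file schema, so primitive type and repetition stay consistent.
  def project(schema: MessageType, wanted: Seq[String]): MessageType = {
    val builder = Types.buildMessage()
    for (name <- wanted) {
      // getFieldIndex throws if the field does not exist in the schema.
      builder.addField(schema.getType(schema.getFieldIndex(name)))
    }
    builder.named(schema.getName)
  }

  val projectionType: MessageType = project(fileSchema, Seq("id", "score"))
}
```

The resulting `projectionType` is what would then be passed to `ParquetFileReader.setRequestedSchema`.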
Immediately after sending this I realized that I also needed to pass the
projection message type in the following lines:
val columnIO = new ColumnIOFactory().getColumnIO(projectionType)
val recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(projectionType))
I
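For context, the two lines above might sit in a read loop along the following lines. This is a sketch under assumptions, not the code from the thread: the `countProjectedRows` helper and the object name are hypothetical, the file is opened through `HadoopInputFile` (available in parquet-hadoop 1.10 and later), and the records are only counted rather than processed.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.example.data.Group
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile
import org.apache.parquet.io.ColumnIOFactory
import org.apache.parquet.schema.MessageType

object ProjectedReadSketch {
  // Reads every row group using only the projected columns and
  // returns the number of records materialized.
  def countProjectedRows(file: Path, projectionType: MessageType, conf: Configuration): Long = {
    val reader = ParquetFileReader.open(HadoopInputFile.fromPath(file, conf))
    try {
      // Restrict column chunk reads to the projection.
      reader.setRequestedSchema(projectionType)
      // The same projection must drive record assembly.
      val columnIO = new ColumnIOFactory().getColumnIO(projectionType)
      var rows = 0L
      var pages = reader.readNextRowGroup()
      while (pages != null) {
        val recordReader =
          columnIO.getRecordReader(pages, new GroupRecordConverter(projectionType))
        var i = 0L
        while (i < pages.getRowCount) {
          val group: Group = recordReader.read() // contains only projected fields
          rows += 1
          i += 1
        }
        pages = reader.readNextRowGroup()
      }
      rows
    } finally reader.close()
  }
}
```

Passing the same `projectionType` to `setRequestedSchema`, `getColumnIO`, and `GroupRecordConverter` keeps the pages that are read and the records that are assembled in agreement.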
[
https://issues.apache.org/jira/browse/PARQUET-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoine Pitrou updated PARQUET-1244:
Labels: beginner (was: )
> Documentation link to logical types broken
>
Antoine Pitrou created PARQUET-1269:
---
Summary: [C++] Scanning fails with list columns
Key: PARQUET-1269
URL: https://issues.apache.org/jira/browse/PARQUET-1269
Project: Parquet
Issue Type:
Antoine Pitrou created PARQUET-1270:
---
Summary: [C++] Executable tools do not get installed
Key: PARQUET-1270
URL: https://issues.apache.org/jira/browse/PARQUET-1270
Project: Parquet
Issue
[
https://issues.apache.org/jira/browse/PARQUET-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437192#comment-16437192
]
ASF GitHub Bot commented on PARQUET-1270:
-
pitrou opened a new pull request #455: PARQUET-1270:
Antoine Pitrou created PARQUET-1271:
---
Summary: [C++] "parquet_reader" should be "parquet-reader"
Key: PARQUET-1271
URL: https://issues.apache.org/jira/browse/PARQUET-1271
Project: Parquet
[
https://issues.apache.org/jira/browse/PARQUET-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437369#comment-16437369
]
ASF GitHub Bot commented on PARQUET-968:
costimuraru commented on issue #411: PARQUET-968 Add
[
https://issues.apache.org/jira/browse/PARQUET-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437294#comment-16437294
]
ASF GitHub Bot commented on PARQUET-968:
costimuraru commented on issue #411: PARQUET-968 Add