[jira] [Updated] (PARQUET-520) Add version of LocalFileSource that uses memory-mapping for zero-copy reads

2016-02-22 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated PARQUET-520:
-
Description: Repurposed this JIRA after PARQUET-533. Memory-mapping will 
save us memory allocations and improve performance in some applications. We 
could offer it as an optional API.   (was: Noted this while working on 
PARQUET-497. If we are using a memory-mapped file, then copying data into a 
{{ScopedInMemoryInputStream}} as we do now is unnecessary, and avoiding the 
copy will yield improved performance. Perhaps this should be made a property 
of the {{InputStream}} (i.e. indicate whether it supports zero-copy reads, 
such that the returned buffer does not become invalid after future reads as 
long as the stream -- the memory map specifically in this example -- is alive))
Summary: Add version of LocalFileSource that uses memory-mapping for 
zero-copy reads  (was: Add support for zero-copy InputStreams on memory-mapped 
files)

> Add version of LocalFileSource that uses memory-mapping for zero-copy reads
> ---
>
> Key: PARQUET-520
> URL: https://issues.apache.org/jira/browse/PARQUET-520
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Wes McKinney
> Assignee: Wes McKinney
>
> Repurposed this JIRA after PARQUET-533. Memory-mapping will save us memory 
> allocations and improve performance in some applications. We could offer it 
> as an optional API.
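
To illustrate the zero-copy idea in this issue, here is a minimal Python sketch. It is not the parquet-cpp API: the class name MemoryMappedSource and its methods are hypothetical, chosen only for this example. It contrasts a copying read with a read that returns a view aliasing the memory map, and shows that the view stays valid for as long as the map is alive.

```python
import mmap
import os
import tempfile


class MemoryMappedSource:
    """Hypothetical read-only file source backed by a memory map."""

    def __init__(self, path):
        with open(path, "rb") as f:
            # Length 0 maps the whole file. mmap keeps its own duplicate of
            # the descriptor, so the file object can be closed right away.
            self._map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def size(self):
        return len(self._map)

    def _check(self, offset, length):
        if offset < 0 or length < 0 or offset + length > len(self._map):
            raise ValueError("read outside the mapped file")

    def read_copy(self, offset, length):
        """Copying read: slicing an mmap returns a new bytes object."""
        self._check(offset, length)
        return self._map[offset:offset + length]

    def read_zero_copy(self, offset, length):
        """Zero-copy read: returns a memoryview that aliases the mapping."""
        self._check(offset, length)
        return memoryview(self._map)[offset:offset + length]

    def close(self):
        # Raises BufferError while zero-copy views are still outstanding.
        self._map.close()


def demo():
    """Write a small temporary file and read part of it both ways."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"PAR1 column data PAR1")
        os.close(fd)
        source = MemoryMappedSource(path)
        view = source.read_zero_copy(5, 11)
        copied = source.read_copy(5, 11)
        same = bytes(view) == copied
        view.release()  # give the view back before unmapping
        source.close()
        return copied, same
    finally:
        os.remove(path)
```

In this sketch the lifetime rule is enforced by the runtime: closing the map while a view is still held raises BufferError, so a caller cannot end up with a dangling buffer. A C++ implementation would have to state that contract explicitly, which is what the earlier description of this issue suggested.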



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (PARQUET-525) Test coverage for malformed file failure modes on the read path

2016-02-22 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PARQUET-525.
---
   Resolution: Fixed
Fix Version/s: cpp-0.1

Issue resolved by pull request 60
[https://github.com/apache/parquet-cpp/pull/60]

> Test coverage for malformed file failure modes on the read path
> ---
>
> Key: PARQUET-525
> URL: https://issues.apache.org/jira/browse/PARQUET-525
> Project: Parquet
> Issue Type: Test
> Components: parquet-cpp
> Reporter: Wes McKinney
> Fix For: cpp-0.1
>
>
> These code paths do not have test coverage. We should construct test cases 
> that cover each possible kind of malformation.
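
As a rough illustration of what such test cases might exercise, here is a minimal Python sketch, not the parquet-cpp code or the tests from the pull request. It relies only on the file layout in the Parquet format specification: a 4-byte "PAR1" magic at the start, and at the end a 4-byte little-endian metadata length followed by the same magic. The function name locate_metadata, the exception type, and the particular failure modes chosen are assumptions made for this example.

```python
import struct

MAGIC = b"PAR1"
FOOTER_TAIL = 8  # 4-byte little-endian metadata length + 4-byte magic


class MalformedFileError(Exception):
    """Raised when the bytes cannot be a well-formed Parquet file."""


def locate_metadata(data):
    """Return (start, length) of the file metadata block, or raise."""
    if len(data) < len(MAGIC) + FOOTER_TAIL:
        raise MalformedFileError("file too small")
    if data[:4] != MAGIC:
        raise MalformedFileError("bad leading magic")
    if data[-4:] != MAGIC:
        raise MalformedFileError("bad trailing magic")
    (length,) = struct.unpack("<I", data[-8:-4])
    start = len(data) - FOOTER_TAIL - length
    if start < len(MAGIC):
        raise MalformedFileError("metadata length exceeds file size")
    return start, length


def is_rejected(data):
    """Helper for tests: True if locate_metadata refuses the input."""
    try:
        locate_metadata(data)
    except MalformedFileError:
        return True
    return False


# One hand-built input per failure mode, plus one well-formed layout.
GOOD = MAGIC + b"x" * 5 + struct.pack("<I", 5) + MAGIC
MALFORMED = {
    "too small": b"PAR1PAR1",
    "bad leading magic": b"XXXX" + GOOD[4:],
    "bad trailing magic": GOOD[:-4] + b"XXXX",
    "length too large": MAGIC + struct.pack("<I", 100) + MAGIC,
}
```

Each entry in MALFORMED targets exactly one check, so a failing test points at the code path that lost its guard. Corrupt metadata contents (for example, undecodable Thrift) would need further cases that this sketch does not attempt.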





[jira] [Assigned] (PARQUET-543) Remove BoundedInt encodings

2016-02-22 Thread Ryan Blue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Blue reassigned PARQUET-543:
-

Assignee: Ryan Blue

> Remove BoundedInt encodings
> ---
>
> Key: PARQUET-543
> URL: https://issues.apache.org/jira/browse/PARQUET-543
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: 1.8.1
> Reporter: Ryan Blue
> Assignee: Ryan Blue
>
> The classes in org.apache.parquet.column.values.boundedint aren't used. It 
> looks like this was intended to be the "right" way to use the RLE/BitPacking 
> hybrid, but callers ended up instantiating the RLE encoder or writer directly.
> The ZeroIntegerValuesReader and DevNullValuesWriter are used, but should be 
> relocated. The ZeroIntegerValuesReader is only used when the encoding is RLE 
> (in 
> [Encoding.java|https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/Encoding.java#L119])
>  and the DevNullValuesWriter actually writes BIT_PACKED values. It would be 
> better to relocate those classes to the rle and bitpacking packages.





[jira] [Created] (PARQUET-543) Remove BoundedInt encodings

2016-02-22 Thread Ryan Blue (JIRA)
Ryan Blue created PARQUET-543:
-

Summary: Remove BoundedInt encodings
Key: PARQUET-543
URL: https://issues.apache.org/jira/browse/PARQUET-543
Project: Parquet
Issue Type: Improvement
Components: parquet-mr
Affects Versions: 1.8.1
Reporter: Ryan Blue


The classes in org.apache.parquet.column.values.boundedint aren't used. It 
looks like this was intended to be the "right" way to use the RLE/BitPacking 
hybrid, but callers ended up instantiating the RLE encoder or writer directly.

The ZeroIntegerValuesReader and DevNullValuesWriter are used, but should be 
relocated. The ZeroIntegerValuesReader is only used when the encoding is RLE 
(in 
[Encoding.java|https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/Encoding.java#L119])
 and the DevNullValuesWriter actually writes BIT_PACKED values. It would be 
better to relocate those classes to the rle and bitpacking packages.


