[jira] [Commented] (PARQUET-968) Add Hive/Presto support in ProtoParquet

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418262#comment-16418262
 ] 

ASF GitHub Bot commented on PARQUET-968:


rdblue commented on issue #411: PARQUET-968 Add Hive/Presto support in 
ProtoParquet
URL: https://github.com/apache/parquet-mr/pull/411#issuecomment-377076348
 
 
   These changes sound good to me once there is consensus from everyone that 
understands the protobuf parts.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Hive/Presto support in ProtoParquet
> ---
>
> Key: PARQUET-968
> URL: https://issues.apache.org/jira/browse/PARQUET-968
> Project: Parquet
>  Issue Type: Task
>Reporter: Constantin Muraru
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (PARQUET-1259) Parquet-protobuf support both protobuf 2 and protobuf 3

2018-03-28 Thread Benoit Hanotte (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1641#comment-1641
 ] 

Benoit Hanotte edited comment on PARQUET-1259 at 3/28/18 5:19 PM:
--

As discussed today during the parquet-sync, support for protobuf 2 is dropped 
in favor of protobuf 3, as the latter supports both 'proto2' and 'proto3' 
syntaxes, allowing for backward compatibility. Unit tests also cover both 
syntaxes.
parquet-protobuf users may have been able to keep using it with protobuf 2 
instead of 3 because the base API is somewhat compatible; however, we can't 
commit to supporting both as it is not tested.
The version bump was done in April 2017 with this commit: 
[https://github.com/apache/parquet-mr/commit/70f28810a5547219e18ffc3465f519c454fee6e5#diff-027029268e253c28f3ed7866525f3207]

Support for both versions will not be implemented, since protobuf 3 offers 
backward compatibility and supporting two implementations would be 
time-consuming.

Unless a major objection is raised, I believe we can close this ticket.


was (Author: b.hanotte):
As discussed today during the parquet-sync, support for protobuf 2 is dropped 
in favor of protobuf 3, as the latter supports both 'proto2' and 'proto3' 
syntaxes, allowing for backward compatibility. Unit tests also cover both 
syntaxes.
parquet-protobuf users may have been able to keep using it with protobuf 2 
instead of 3 because the base API is somewhat compatible; however, we can't 
commit to supporting both as it is not tested.
The version bump was done in April 2017 with this commit: 
[https://github.com/apache/parquet-mr/commit/70f28810a5547219e18ffc3465f519c454fee6e5#diff-027029268e253c28f3ed7866525f3207]

Support for both versions will not be implemented, since protobuf 3 offers 
backward compatibility and supporting two implementations would be 
time-consuming.

The version bump should be announced in the next release notes.

Unless a major objection is raised, I believe we can close this ticket.

> Parquet-protobuf support both protobuf 2 and protobuf 3
> ---
>
> Key: PARQUET-1259
> URL: https://issues.apache.org/jira/browse/PARQUET-1259
> Project: Parquet
>  Issue Type: New Feature
>Affects Versions: 1.10.0, 1.9.1
>Reporter: Qinghui Xu
>Priority: Major
>
> With the merge of pull request 
> [https://github.com/apache/parquet-mr/pull/407], parquet-protobuf now uses 
> protobuf 3. This implies that it cannot work in an environment where people 
> use protobuf 2 in their own dependencies, because protobuf 3 introduces new 
> APIs and breaking changes. People will face a dependency version conflict 
> with the next parquet-protobuf release (e.g. 1.9.1 or 1.10.0).
> What if we support both protobuf 2 and protobuf 3 by providing 
> parquet-protobuf and parquet-protobuf2?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (PARQUET-1259) Parquet-protobuf support both protobuf 2 and protobuf 3

2018-03-28 Thread Benoit Hanotte (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1641#comment-1641
 ] 

Benoit Hanotte edited comment on PARQUET-1259 at 3/28/18 5:19 PM:
--

As discussed today during the parquet-sync, support for protobuf 2 is dropped 
in favor of protobuf 3, as the latter supports both 'proto2' and 'proto3' 
syntaxes, allowing for backward compatibility. Unit tests also cover both 
syntaxes.
parquet-protobuf users may have been able to keep using it with protobuf 2 
instead of 3 because the base API is somewhat compatible; however, we can't 
commit to supporting both as it is not tested.
The version bump was done in April 2017 with this commit: 
[https://github.com/apache/parquet-mr/commit/70f28810a5547219e18ffc3465f519c454fee6e5#diff-027029268e253c28f3ed7866525f3207]

Support for both versions will not be implemented, since protobuf 3 offers 
backward compatibility and supporting two implementations would be 
time-consuming.

The version bump should be announced in the next release notes.

Unless a major objection is raised, I believe we can close this ticket.


was (Author: b.hanotte):
As discussed today during the parquet-sync, support for protobuf 2 is dropped 
in favor of protobuf 3, as the latter supports both 'proto2' and 'proto3' 
syntaxes, allowing for backward compatibility. Unit tests also cover both 
syntaxes.
parquet-protobuf users may have been able to keep using it with protobuf 2 
instead of 3 because the base API is somewhat compatible; however, we can't 
commit to supporting both as it is not tested.
The support was dropped in April 2017 with this commit: 
[https://github.com/apache/parquet-mr/commit/70f28810a5547219e18ffc3465f519c454fee6e5#diff-027029268e253c28f3ed7866525f3207]

Support for both versions will not be implemented, since protobuf 3 offers 
backward compatibility and supporting two implementations would be 
time-consuming.

The version bump should be announced in the next release notes.

Unless a major objection is raised, I believe we can close this ticket.

> Parquet-protobuf support both protobuf 2 and protobuf 3
> ---
>
> Key: PARQUET-1259
> URL: https://issues.apache.org/jira/browse/PARQUET-1259
> Project: Parquet
>  Issue Type: New Feature
>Affects Versions: 1.10.0, 1.9.1
>Reporter: Qinghui Xu
>Priority: Major
>
> With the merge of pull request 
> [https://github.com/apache/parquet-mr/pull/407], parquet-protobuf now uses 
> protobuf 3. This implies that it cannot work in an environment where people 
> use protobuf 2 in their own dependencies, because protobuf 3 introduces new 
> APIs and breaking changes. People will face a dependency version conflict 
> with the next parquet-protobuf release (e.g. 1.9.1 or 1.10.0).
> What if we support both protobuf 2 and protobuf 3 by providing 
> parquet-protobuf and parquet-protobuf2?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1259) Parquet-protobuf support both protobuf 2 and protobuf 3

2018-03-28 Thread Benoit Hanotte (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1641#comment-1641
 ] 

Benoit Hanotte commented on PARQUET-1259:
-

As discussed today during the parquet-sync, support for protobuf 2 is dropped 
in favor of protobuf 3, as the latter supports both 'proto2' and 'proto3' 
syntaxes, allowing for backward compatibility. Unit tests also cover both 
syntaxes.
parquet-protobuf users may have been able to keep using it with protobuf 2 
instead of 3 because the base API is somewhat compatible; however, we can't 
commit to supporting both as it is not tested.
The support was dropped in April 2017 with this commit: 
[https://github.com/apache/parquet-mr/commit/70f28810a5547219e18ffc3465f519c454fee6e5#diff-027029268e253c28f3ed7866525f3207]

Support for both versions will not be implemented, since protobuf 3 offers 
backward compatibility and supporting two implementations would be 
time-consuming.

The version bump should be announced in the next release notes.

Unless a major objection is raised, I believe we can close this ticket.

> Parquet-protobuf support both protobuf 2 and protobuf 3
> ---
>
> Key: PARQUET-1259
> URL: https://issues.apache.org/jira/browse/PARQUET-1259
> Project: Parquet
>  Issue Type: New Feature
>Affects Versions: 1.10.0, 1.9.1
>Reporter: Qinghui Xu
>Priority: Major
>
> With the merge of pull request 
> [https://github.com/apache/parquet-mr/pull/407], parquet-protobuf now uses 
> protobuf 3. This implies that it cannot work in an environment where people 
> use protobuf 2 in their own dependencies, because protobuf 3 introduces new 
> APIs and breaking changes. People will face a dependency version conflict 
> with the next parquet-protobuf release (e.g. 1.9.1 or 1.10.0).
> What if we support both protobuf 2 and protobuf 3 by providing 
> parquet-protobuf and parquet-protobuf2?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PARQUET-1255) [C++] Exceptions thrown in some tests

2018-03-28 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned PARQUET-1255:


Assignee: Antoine Pitrou

> [C++] Exceptions thrown in some tests
> -
>
> Key: PARQUET-1255
> URL: https://issues.apache.org/jira/browse/PARQUET-1255
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Major
> Fix For: cpp-1.5.0
>
>
> Some tests (not all) throw a basic_string exception. Example:
> {code}
> $ ./debug/reader-test 
> Running main() from gtest_main.cc
> [==] Running 11 tests from 4 test cases.
> [--] Global test environment set-up.
> [--] 7 tests from TestAllTypesPlain
> [ RUN  ] TestAllTypesPlain.NoopConstructDestruct
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.NoopConstructDestruct (0 ms)
> [ RUN  ] TestAllTypesPlain.TestBatchRead
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.TestBatchRead (0 ms)
> [ RUN  ] TestAllTypesPlain.TestFlatScannerInt32
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.TestFlatScannerInt32 (0 ms)
> [ RUN  ] TestAllTypesPlain.TestSetScannerBatchSize
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.TestSetScannerBatchSize (0 ms)
> [ RUN  ] TestAllTypesPlain.DebugPrintWorks
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.DebugPrintWorks (0 ms)
> [ RUN  ] TestAllTypesPlain.ColumnSelection
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.ColumnSelection (0 ms)
> [ RUN  ] TestAllTypesPlain.ColumnSelectionOutOfRange
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.ColumnSelectionOutOfRange (0 ms)
> [--] 7 tests from TestAllTypesPlain (0 ms total)
> [--] 2 tests from TestLocalFile
> [ RUN  ] TestLocalFile.FileClosedOnDestruction
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestLocalFile.FileClosedOnDestruction (0 ms)
> [ RUN  ] TestLocalFile.OpenWithMetadata
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestLocalFile.OpenWithMetadata (0 ms)
> [--] 2 tests from TestLocalFile (0 ms total)
> [--] 1 test from TestFileReaderAdHoc
> [ RUN  ] TestFileReaderAdHoc.NationDictTruncatedDataPage
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in the test body.
> [  FAILED  ] TestFileReaderAdHoc.NationDictTruncatedDataPage (1 ms)
> [--] 1 test from TestFileReaderAdHoc (1 ms total)
> [--] 1 test from TestJSONWithLocalFile
> [ RUN  ] TestJSONWithLocalFile.JSONOutput
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in the test body.
> [  FAILED  ] TestJSONWithLocalFile.JSONOutput (0 ms)
> [--] 1 test from TestJSONWithLocalFile (0 ms total)
> [--] Global test environment tear-down
> [==] 11 tests from 4 test cases ran. (1 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 11 tests, listed below:
> [  FAILED  ] TestAllTypesPlain.NoopConstructDestruct
> [  FAILED  ] TestAllTypesPlain.TestBatchRead
> [  FAILED  ] TestAllTypesPlain.TestFlatScannerInt32
> [  FAILED  ] TestAllTypesPlain.TestSetScannerBatchSize
> [  FAILED  ] TestAllTypesPlain.DebugPrintWorks
> [  FAILED  ] TestAllTypesPlain.ColumnSelection
> [  FAILED  ] TestAllTypesPlain.ColumnSelectionOutOfRange
> [  FAILED  ] TestLocalFile.FileClosedOnDestruction
> [  FAILED  ] TestLocalFile.OpenWithMetadata
> [  FAILED  ] TestFileReaderAdHoc.NationDictTruncatedDataPage
> [  FAILED  ] TestJSONWithLocalFile.JSONOutput
> 11 FAILED TESTS
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1255) [C++] Exceptions thrown in some tests

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417679#comment-16417679
 ] 

ASF GitHub Bot commented on PARQUET-1255:
-

xhochy closed pull request #448: PARQUET-1255: Fix error message when 
PARQUET_TEST_DATA isn't defined
URL: https://github.com/apache/parquet-cpp/pull/448
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/parquet/arrow/arrow-reader-writer-test.cc 
b/src/parquet/arrow/arrow-reader-writer-test.cc
index 8051ff17..dae2e8f9 100644
--- a/src/parquet/arrow/arrow-reader-writer-test.cc
+++ b/src/parquet/arrow/arrow-reader-writer-test.cc
@@ -39,6 +39,8 @@
 
 #include "parquet/file_writer.h"
 
+#include "parquet/util/test-common.h"
+
 #include "arrow/api.h"
 #include "arrow/test-util.h"
 #include "arrow/type_traits.h"
@@ -2098,8 +2100,7 @@ TEST(TestImpalaConversion, NanosecondToImpala) {
 
 TEST(TestArrowReaderAdHoc, Int96BadMemoryAccess) {
   // PARQUET-995
-  const char* data_dir = std::getenv("PARQUET_TEST_DATA");
-  std::string dir_string(data_dir);
+  std::string dir_string(test::get_data_dir());
   std::stringstream ss;
   ss << dir_string << "/"
  << "alltypes_plain.parquet";
@@ -2119,7 +2120,7 @@ class TestArrowReaderAdHocSpark
   std::tuple>> {};
 
 TEST_P(TestArrowReaderAdHocSpark, ReadDecimals) {
-  std::string path(std::getenv("PARQUET_TEST_DATA"));
+  std::string path(test::get_data_dir());
 
   std::string filename;
   std::shared_ptr<::DataType> decimal_type;
diff --git a/src/parquet/reader-test.cc b/src/parquet/reader-test.cc
index c536fdcb..d628f472 100644
--- a/src/parquet/reader-test.cc
+++ b/src/parquet/reader-test.cc
@@ -30,6 +30,7 @@
 #include "parquet/file_reader.h"
 #include "parquet/printer.h"
 #include "parquet/util/memory.h"
+#include "parquet/util/test-common.h"
 
 using std::string;
 
@@ -37,10 +38,8 @@ namespace parquet {
 
 using ReadableFile = ::arrow::io::ReadableFile;
 
-const char* data_dir = std::getenv("PARQUET_TEST_DATA");
-
 std::string alltypes_plain() {
-  std::string dir_string(data_dir);
+  std::string dir_string(test::get_data_dir());
   std::stringstream ss;
   ss << dir_string << "/"
  << "alltypes_plain.parquet";
@@ -48,7 +47,7 @@ std::string alltypes_plain() {
 }
 
 std::string nation_dict_truncated_data_page() {
-  std::string dir_string(data_dir);
+  std::string dir_string(test::get_data_dir());
   std::stringstream ss;
   ss << dir_string << "/"
  << "nation.dict-malformed.parquet";
@@ -171,7 +170,7 @@ TEST_F(TestAllTypesPlain, ColumnSelectionOutOfRange) {
 class TestLocalFile : public ::testing::Test {
  public:
   void SetUp() {
-std::string dir_string(data_dir);
+std::string dir_string(test::get_data_dir());
 
 std::stringstream ss;
 ss << dir_string << "/"
diff --git a/src/parquet/util/test-common.h b/src/parquet/util/test-common.h
index ebf48516..22b748e7 100644
--- a/src/parquet/util/test-common.h
+++ b/src/parquet/util/test-common.h
@@ -23,6 +23,7 @@
 #include 
 #include 
 
+#include "parquet/exception.h"
 #include "parquet/types.h"
 
 using std::vector;
@@ -35,6 +36,20 @@ typedef ::testing::Types
 ParquetTypes;
 
+class ParquetTestException : public parquet::ParquetException {
+  using ParquetException::ParquetException;
+};
+
+const char* get_data_dir() {
+  const auto result = std::getenv("PARQUET_TEST_DATA");
+  if (!result || !result[0]) {
+throw ParquetTestException(
+"Please point the PARQUET_TEST_DATA environment "
+"variable to the test data directory");
+  }
+  return result;
+}
+
 template 
 static inline void assert_vector_equal(const vector& left, const vector& 
right) {
   ASSERT_EQ(left.size(), right.size());


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++] Exceptions thrown in some tests
> -
>
> Key: PARQUET-1255
> URL: https://issues.apache.org/jira/browse/PARQUET-1255
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: cpp-1.5.0
>
>
> Some tests (not all) throw a basic_string exception. Example:
> {code}
> $ ./debug/reader-test 
> Running main() from gtest_main.cc
> 

[jira] [Assigned] (PARQUET-1071) [C++] parquet::arrow::FileWriter::Close is not idempotent

2018-03-28 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned PARQUET-1071:


Assignee: Antoine Pitrou

> [C++] parquet::arrow::FileWriter::Close is not idempotent
> -
>
> Key: PARQUET-1071
> URL: https://issues.apache.org/jira/browse/PARQUET-1071
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Affects Versions: cpp-1.2.0
>Reporter: Wes McKinney
>Assignee: Antoine Pitrou
>Priority: Major
> Fix For: cpp-1.5.0
>
>
> Encountered a segfault when calling multiple times from Python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (PARQUET-1255) [C++] Exceptions thrown in some tests

2018-03-28 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved PARQUET-1255.
--
   Resolution: Fixed
Fix Version/s: cpp-1.5.0

Issue resolved by pull request 448
[https://github.com/apache/parquet-cpp/pull/448]

> [C++] Exceptions thrown in some tests
> -
>
> Key: PARQUET-1255
> URL: https://issues.apache.org/jira/browse/PARQUET-1255
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: cpp-1.5.0
>
>
> Some tests (not all) throw a basic_string exception. Example:
> {code}
> $ ./debug/reader-test 
> Running main() from gtest_main.cc
> [==] Running 11 tests from 4 test cases.
> [--] Global test environment set-up.
> [--] 7 tests from TestAllTypesPlain
> [ RUN  ] TestAllTypesPlain.NoopConstructDestruct
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.NoopConstructDestruct (0 ms)
> [ RUN  ] TestAllTypesPlain.TestBatchRead
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.TestBatchRead (0 ms)
> [ RUN  ] TestAllTypesPlain.TestFlatScannerInt32
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.TestFlatScannerInt32 (0 ms)
> [ RUN  ] TestAllTypesPlain.TestSetScannerBatchSize
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.TestSetScannerBatchSize (0 ms)
> [ RUN  ] TestAllTypesPlain.DebugPrintWorks
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.DebugPrintWorks (0 ms)
> [ RUN  ] TestAllTypesPlain.ColumnSelection
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.ColumnSelection (0 ms)
> [ RUN  ] TestAllTypesPlain.ColumnSelectionOutOfRange
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestAllTypesPlain.ColumnSelectionOutOfRange (0 ms)
> [--] 7 tests from TestAllTypesPlain (0 ms total)
> [--] 2 tests from TestLocalFile
> [ RUN  ] TestLocalFile.FileClosedOnDestruction
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestLocalFile.FileClosedOnDestruction (0 ms)
> [ RUN  ] TestLocalFile.OpenWithMetadata
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in SetUp().
> [  FAILED  ] TestLocalFile.OpenWithMetadata (0 ms)
> [--] 2 tests from TestLocalFile (0 ms total)
> [--] 1 test from TestFileReaderAdHoc
> [ RUN  ] TestFileReaderAdHoc.NationDictTruncatedDataPage
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in the test body.
> [  FAILED  ] TestFileReaderAdHoc.NationDictTruncatedDataPage (1 ms)
> [--] 1 test from TestFileReaderAdHoc (1 ms total)
> [--] 1 test from TestJSONWithLocalFile
> [ RUN  ] TestJSONWithLocalFile.JSONOutput
> unknown file: Failure
> C++ exception with description "basic_string::_S_construct null not valid" 
> thrown in the test body.
> [  FAILED  ] TestJSONWithLocalFile.JSONOutput (0 ms)
> [--] 1 test from TestJSONWithLocalFile (0 ms total)
> [--] Global test environment tear-down
> [==] 11 tests from 4 test cases ran. (1 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 11 tests, listed below:
> [  FAILED  ] TestAllTypesPlain.NoopConstructDestruct
> [  FAILED  ] TestAllTypesPlain.TestBatchRead
> [  FAILED  ] TestAllTypesPlain.TestFlatScannerInt32
> [  FAILED  ] TestAllTypesPlain.TestSetScannerBatchSize
> [  FAILED  ] TestAllTypesPlain.DebugPrintWorks
> [  FAILED  ] TestAllTypesPlain.ColumnSelection
> [  FAILED  ] TestAllTypesPlain.ColumnSelectionOutOfRange
> [  FAILED  ] TestLocalFile.FileClosedOnDestruction
> [  FAILED  ] TestLocalFile.OpenWithMetadata
> [  FAILED  ] TestFileReaderAdHoc.NationDictTruncatedDataPage
> [  FAILED  ] TestJSONWithLocalFile.JSONOutput
> 11 FAILED TESTS
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1071) [C++] parquet::arrow::FileWriter::Close is not idempotent

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417676#comment-16417676
 ] 

ASF GitHub Bot commented on PARQUET-1071:
-

xhochy closed pull request #449: PARQUET-1071: Check that 
arrow::FileWriter::Close() is idempotent
URL: https://github.com/apache/parquet-cpp/pull/449
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/parquet/arrow/arrow-reader-writer-test.cc 
b/src/parquet/arrow/arrow-reader-writer-test.cc
index 8051ff17..46218be7 100644
--- a/src/parquet/arrow/arrow-reader-writer-test.cc
+++ b/src/parquet/arrow/arrow-reader-writer-test.cc
@@ -574,6 +574,8 @@ class TestParquetIO : public ::testing::Test {
 ASSERT_OK_NO_THROW(writer.NewRowGroup(values->length()));
 ASSERT_OK_NO_THROW(writer.WriteColumnChunk(*values));
 ASSERT_OK_NO_THROW(writer.Close());
+// writer.Close() should be idempotent
+ASSERT_OK_NO_THROW(writer.Close());
   }
 
   std::shared_ptr sink_;


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++] parquet::arrow::FileWriter::Close is not idempotent
> -
>
> Key: PARQUET-1071
> URL: https://issues.apache.org/jira/browse/PARQUET-1071
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Affects Versions: cpp-1.2.0
>Reporter: Wes McKinney
>Priority: Major
> Fix For: cpp-1.5.0
>
>
> Encountered a segfault when calling multiple times from Python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (PARQUET-1071) [C++] parquet::arrow::FileWriter::Close is not idempotent

2018-03-28 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved PARQUET-1071.
--
   Resolution: Fixed
Fix Version/s: cpp-1.5.0

Issue resolved by pull request 449
[https://github.com/apache/parquet-cpp/pull/449]

> [C++] parquet::arrow::FileWriter::Close is not idempotent
> -
>
> Key: PARQUET-1071
> URL: https://issues.apache.org/jira/browse/PARQUET-1071
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-cpp
>Affects Versions: cpp-1.2.0
>Reporter: Wes McKinney
>Priority: Major
> Fix For: cpp-1.5.0
>
>
> Encountered a segfault when calling multiple times from Python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


parquet sync happening now

2018-03-28 Thread Julien Le Dem
https://meet.google.com/xpc-gwie-sem


[jira] [Commented] (PARQUET-1222) Specify a well-defined sorting order for float and double types

2018-03-28 Thread Zoltan Ivanfi (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417460#comment-16417460
 ] 

Zoltan Ivanfi commented on PARQUET-1222:


I updated this JIRA to distinguish it from PARQUET-1251. To summarize:
 * PARQUET-1251 is a "hotfix" that describes a workaround for handling 
statistics written using the ambiguous specification.
 * This JIRA is about specifying a well-defined sort order.

> Specify a well-defined sorting order for float and double types
> ---
>
> Key: PARQUET-1222
> URL: https://issues.apache.org/jira/browse/PARQUET-1222
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-format
>Reporter: Zoltan Ivanfi
>Priority: Critical
>
> Currently parquet-format specifies the sort order for floating point numbers 
> as follows:
> {code:java}
>*   FLOAT - signed comparison of the represented value
>*   DOUBLE - signed comparison of the represented value
> {code}
> The problem is that the comparison of floating point numbers is only a 
> partial ordering with strange behaviour in specific corner cases. For 
> example, according to IEEE 754, -0 is neither less than nor greater than +0, 
> and any ordered comparison involving NaN returns false. This ordering is not 
> suitable for statistics. Additionally, the Java implementation already uses a 
> different (total) ordering that handles these cases correctly but differently 
> than the C++ implementations, which leads to interoperability problems.
> TypeDefinedOrder for doubles and floats should be deprecated and a new 
> TotalFloatingPointOrder should be introduced. The default for writing doubles 
> and floats would be the new TotalFloatingPointOrder. This ordering should be 
> effective and easy to implement in all programming languages.
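As an aside, the total ordering that the Java implementation already applies can 
be illustrated with Float.compare, which orders -0.0 below +0.0 and places NaN 
above every other value. This is only an illustration of a total order that 
resolves the corner cases above, not the proposed TotalFloatingPointOrder itself:

{code:java}
import java.util.Arrays;

public class FloatOrderExample {
  public static void main(String[] args) {
    // The signed comparison from the current spec is only a partial order:
    System.out.println(-0.0f < 0.0f);      // false: -0 and +0 compare as equal
    System.out.println(Float.NaN < 1.0f);  // false
    System.out.println(Float.NaN > 1.0f);  // false: NaN is unordered

    // Float.compare is a total order: -0.0 sorts below +0.0, NaN above +Infinity.
    Float[] values = {Float.NaN, 1.0f, -0.0f, 0.0f, Float.NEGATIVE_INFINITY};
    Arrays.sort(values, Float::compare);
    System.out.println(Arrays.toString(values));
    // prints [-Infinity, -0.0, 0.0, 1.0, NaN]
  }
}
{code}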



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PARQUET-1222) Specify a well-defined sorting order for float and double types

2018-03-28 Thread Zoltan Ivanfi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Ivanfi updated PARQUET-1222:
---
Summary: Specify a well-defined sorting order for float and double types  
(was: Definition of float and double sort order is ambiguous)

> Specify a well-defined sorting order for float and double types
> ---
>
> Key: PARQUET-1222
> URL: https://issues.apache.org/jira/browse/PARQUET-1222
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-format
>Reporter: Zoltan Ivanfi
>Priority: Critical
>
> Currently parquet-format specifies the sort order for floating point numbers 
> as follows:
> {code:java}
>*   FLOAT - signed comparison of the represented value
>*   DOUBLE - signed comparison of the represented value
> {code}
> The problem is that the comparison of floating point numbers is only a 
> partial ordering with strange behaviour in specific corner cases. For 
> example, according to IEEE 754, -0 is neither less than nor greater than +0, 
> and any ordered comparison involving NaN returns false. This ordering is not 
> suitable for statistics. Additionally, the Java implementation already uses a 
> different (total) ordering that handles these cases correctly but differently 
> than the C++ implementations, which leads to interoperability problems.
> TypeDefinedOrder for doubles and floats should be deprecated and a new 
> TotalFloatingPointOrder should be introduced. The default for writing doubles 
> and floats would be the new TotalFloatingPointOrder. This ordering should be 
> effective and easy to implement in all programming languages.
> For reading existing stats created using TypeDefinedOrder, the following 
> compatibility rules should be applied:
> * When looking for NaN values, min and max should be ignored.
> * If the min is a NaN, it should be ignored.
> * If the max is a NaN, it should be ignored.
> * If the min is +0, the row group may contain -0 values as well.
> * If the max is -0, the row group may contain +0 values as well.
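Below is a hedged sketch of how a reader could apply these compatibility rules 
when deciding whether a row group can be pruned for an equality predicate; the 
class and method names are hypothetical and not part of parquet-mr:

{code:java}
// Hypothetical helper illustrating the compatibility rules above; not parquet-mr API.
public class FloatStatsCompat {

  /** Returns true only when the stats prove the row group cannot contain 'target'. */
  public static boolean canSkipRowGroup(double min, double max, double target) {
    // When looking for NaN values, min and max give no information.
    // A NaN min or max is likewise unusable; be conservative and keep the row group.
    if (Double.isNaN(target) || Double.isNaN(min) || Double.isNaN(max)) {
      return false;
    }
    // If min is +0, the row group may also contain -0 values: widen the lower bound.
    if (Double.compare(min, 0.0d) == 0) {
      min = -0.0d;
    }
    // If max is -0, the row group may also contain +0 values: widen the upper bound.
    if (Double.compare(max, -0.0d) == 0) {
      max = 0.0d;
    }
    // Using a total order (like the Java implementation), skip only when the
    // target lies strictly outside [min, max].
    return Double.compare(target, min) < 0 || Double.compare(target, max) > 0;
  }
}
{code}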



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1260) Add Zoltan Ivanfi's code signing key to the KEYS file

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417440#comment-16417440
 ] 

ASF GitHub Bot commented on PARQUET-1260:
-

zivanfi opened a new pull request #91: PARQUET-1260: Add Zoltan Ivanfi's code 
signing key to the KEYS file
URL: https://github.com/apache/parquet-format/pull/91
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Zoltan Ivanfi's code signing key to the KEYS file
> -
>
> Key: PARQUET-1260
> URL: https://issues.apache.org/jira/browse/PARQUET-1260
> Project: Parquet
>  Issue Type: Task
>Reporter: Zoltan Ivanfi
>Priority: Major
>
> To make a release, I would need to have my gpg key added to the KEYS file. I 
> can add it to the repos in a commit, but I don't know how to update 
> [https://dist.apache.org/repos/dist/dev/parquet/KEYS]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PARQUET-1260) Add Zoltan Ivanfi's code signing key to the KEYS file

2018-03-28 Thread Zoltan Ivanfi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Ivanfi reassigned PARQUET-1260:
--

Assignee: Zoltan Ivanfi

> Add Zoltan Ivanfi's code signing key to the KEYS file
> -
>
> Key: PARQUET-1260
> URL: https://issues.apache.org/jira/browse/PARQUET-1260
> Project: Parquet
>  Issue Type: Task
>Reporter: Zoltan Ivanfi
>Assignee: Zoltan Ivanfi
>Priority: Major
>
> To make a release, I would need to have my gpg key added to the KEYS file. I 
> can add it to the repos in a commit, but I don't know how to update 
> [https://dist.apache.org/repos/dist/dev/parquet/KEYS]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PARQUET-1260) Add Zoltan Ivanfi's code signing key to the KEYS file

2018-03-28 Thread Zoltan Ivanfi (JIRA)
Zoltan Ivanfi created PARQUET-1260:
--

 Summary: Add Zoltan Ivanfi's code signing key to the KEYS file
 Key: PARQUET-1260
 URL: https://issues.apache.org/jira/browse/PARQUET-1260
 Project: Parquet
  Issue Type: Task
Reporter: Zoltan Ivanfi


To make a release, I would need to have my gpg key added to the KEYS file. I 
can add it to the repos in a commit, but I don't know how to update 
[https://dist.apache.org/repos/dist/dev/parquet/KEYS]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PARQUET-1259) Parquet-protobuf support both protobuf 2 and protobuf 3

2018-03-28 Thread Qinghui Xu (JIRA)
Qinghui Xu created PARQUET-1259:
---

 Summary: Parquet-protobuf support both protobuf 2 and protobuf 3
 Key: PARQUET-1259
 URL: https://issues.apache.org/jira/browse/PARQUET-1259
 Project: Parquet
  Issue Type: New Feature
Affects Versions: 1.10.0, 1.9.1
Reporter: Qinghui Xu


With the merge of pull request 
[https://github.com/apache/parquet-mr/pull/407], parquet-protobuf now uses 
protobuf 3. This implies that it cannot work in an environment where people use 
protobuf 2 in their own dependencies, because protobuf 3 introduces new APIs 
and breaking changes. People will face a dependency version conflict with the 
next parquet-protobuf release (e.g. 1.9.1 or 1.10.0).

What if we support both protobuf 2 and protobuf 3 by providing parquet-protobuf 
and parquet-protobuf2?

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1258) Update scm developer connection to github

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417364#comment-16417364
 ] 

ASF GitHub Bot commented on PARQUET-1258:
-

zivanfi closed pull request #90: PARQUET-1258: Update scm developer connection 
to github
URL: https://github.com/apache/parquet-format/pull/90
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/pom.xml b/pom.xml
index 9689aec0..46efd041 100644
--- a/pom.xml
+++ b/pom.xml
@@ -38,7 +38,7 @@
   
 scm:git:g...@github.com:apache/parquet-format.git
 scm:git:g...@github.com:apache/parquet-format.git
-
scm:git:https://git-wip-us.apache.org/repos/asf/parquet-format.git
+
scm:git:g...@github.com:apache/parquet-format.git
 HEAD
   
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update scm developer connection to github
> -
>
> Key: PARQUET-1258
> URL: https://issues.apache.org/jira/browse/PARQUET-1258
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-format, parquet-mr
>Affects Versions: 1.10.0, format-2.5.0
>Reporter: Gabor Szadovszky
>Assignee: Gabor Szadovszky
>Priority: Minor
> Fix For: 1.10.0, format-2.5.0
>
>
> After moving to gitbox the old apache repo 
> (https://git-wip-us.apache.org/repos/asf/parquet-format.git) is not working 
> anymore. The pom.xml shall be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Plan for a Parquet new release and writing Parquet file with outputstream

2018-03-28 Thread Jean-Baptiste Onofré
Hi Ryan,

sorry to have been quiet, but I was busy traveling recently :)

Just a quick update about this one:

- I asked a guy from my team to work with me on the Beam ParquetIO. We're also
seeing several users expecting this new IO.
- I will update my current PR to use the Parquet SNAPSHOT and verify that
OutputFile/InputFile are convenient for the Beam use case. I should be able to
do it tomorrow.
- Then, if OutputFile/InputFile are OK for ParquetIO, I will let you know and
kindly ask for a Parquet release.

Is it OK for you?

Thanks!
Regards
JB

On 02/14/2018 02:01 AM, Ryan Blue wrote:
> Jean-Baptiste,
> 
> We're planning a release that will include the new OutputFile class, which
> I think you should be able to use. Is there anything you'd change to make
> this work more easily with Beam?
> 
> rb
> 
> On Tue, Feb 13, 2018 at 12:31 PM, Jean-Baptiste Onofré 
> wrote:
> 
>> Hi guys,
>>
>> I'm working on the Apache Beam ParquetIO:
>>
>> https://github.com/apache/beam/pull/1851
>>
>> In Beam, thanks to FileIO, we support several filesystems (HDFS, S3, ...).
>>
>> While I was able to implement the read part using AvroParquetReader
>> leveraging Beam FileIO, I'm struggling with the writing part.
>>
>> I have to create a ParquetSink implementing FileIO.Sink. In particular, I
>> have to implement the open(WritableByteChannel channel) method.
>>
>> It's not possible to use AvroParquetWriter here as it takes a Path as
>> argument
>> (and from the channel, I can only have an OutputStream).
>>
>> As a workaround, I wanted to use org.apache.parquet.hadoop.
>> ParquetFileWriter,
>> providing my own implementation of org.apache.parquet.io.OutputFile.
>>
>> Unfortunately OutputFile (and the updated method in ParquetFileWriter)
>> exists on
>> Parquet master branch, but it was different on Parquet 1.9.0.
>>
>> So, I have two questions:
>> - do you plan a Parquet 1.9.1 release including org.apache.parquet.io.
>> OutputFile
>> and updated org.apache.parquet.hadoop.ParquetFileWriter ?
>> - using Parquet 1.9.0, do you have any advice how to use
>> AvroParquetWriter/ParquetFileWriter with an OutputStream (or any object
>> that I
>> can get from WritableByteChannel) ?
>>
>> Thanks !
>>
>> Regards
>> JB
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
> 
> 
> 

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
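
For reference, below is a minimal sketch of the kind of OutputFile adapter 
discussed above, wrapping a WritableByteChannel. It assumes the OutputFile and 
PositionOutputStream interfaces as they exist on the parquet-mr master branch at 
this time (create/createOrOverwrite/supportsBlockSize/defaultBlockSize and 
getPos); the class names are made up for the example and the exact API may 
differ in the released version:

{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

import org.apache.parquet.io.OutputFile;
import org.apache.parquet.io.PositionOutputStream;

// Hypothetical adapter exposing a WritableByteChannel as a Parquet OutputFile.
public class ChannelOutputFile implements OutputFile {
  private final WritableByteChannel channel;

  public ChannelOutputFile(WritableByteChannel channel) {
    this.channel = channel;
  }

  @Override
  public PositionOutputStream create(long blockSizeHint) throws IOException {
    return new CountingOutputStream(Channels.newOutputStream(channel));
  }

  @Override
  public PositionOutputStream createOrOverwrite(long blockSizeHint) throws IOException {
    return create(blockSizeHint);
  }

  @Override
  public boolean supportsBlockSize() {
    return false; // a plain channel has no HDFS-style block size
  }

  @Override
  public long defaultBlockSize() {
    return 0;
  }

  /** Wraps an OutputStream and tracks the number of bytes written for getPos(). */
  private static final class CountingOutputStream extends PositionOutputStream {
    private final OutputStream out;
    private long position = 0;

    private CountingOutputStream(OutputStream out) {
      this.out = out;
    }

    @Override
    public long getPos() {
      return position;
    }

    @Override
    public void write(int b) throws IOException {
      out.write(b);
      position++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      out.write(b, off, len);
      position += len;
    }

    @Override
    public void flush() throws IOException {
      out.flush();
    }

    @Override
    public void close() throws IOException {
      out.close();
    }
  }
}
{code}

Such an OutputFile could then be handed to ParquetFileWriter (or to a writer 
builder that accepts an OutputFile) from FileIO.Sink.open(WritableByteChannel), 
which is the workaround described in the quoted message.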


[jira] [Commented] (PARQUET-1143) Update Java for format 2.4.0 changes

2018-03-28 Thread Mark Marsh (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417340#comment-16417340
 ] 

Mark Marsh commented on PARQUET-1143:
-

I'm also happy to test release candidates against my use case if it will help 
get 1.10.0 out.

I'm currently developing against the master branch but it will be problematic 
to push that through QA...

> Update Java for format 2.4.0 changes
> 
>
> Key: PARQUET-1143
> URL: https://issues.apache.org/jira/browse/PARQUET-1143
> Project: Parquet
>  Issue Type: Task
>  Components: parquet-mr
>Affects Versions: 1.9.0, 1.8.2
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1258) Update scm developer connection to github

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417328#comment-16417328
 ] 

ASF GitHub Bot commented on PARQUET-1258:
-

gszadovszky opened a new pull request #462: PARQUET-1258: Update scm developer 
connection to github
URL: https://github.com/apache/parquet-mr/pull/462
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update scm developer connection to github
> -
>
> Key: PARQUET-1258
> URL: https://issues.apache.org/jira/browse/PARQUET-1258
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-format, parquet-mr
>Affects Versions: 1.10.0, format-2.5.0
>Reporter: Gabor Szadovszky
>Assignee: Gabor Szadovszky
>Priority: Minor
> Fix For: 1.10.0, format-2.5.0
>
>
> After moving to gitbox the old apache repo 
> (https://git-wip-us.apache.org/repos/asf/parquet-format.git) is not working 
> anymore. The pom.xml shall be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1258) Update scm developer connection to github

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417324#comment-16417324
 ] 

ASF GitHub Bot commented on PARQUET-1258:
-

gszadovszky opened a new pull request #90: PARQUET-1258: Update scm developer 
connection to github
URL: https://github.com/apache/parquet-format/pull/90
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update scm developer connection to github
> -
>
> Key: PARQUET-1258
> URL: https://issues.apache.org/jira/browse/PARQUET-1258
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-format, parquet-mr
>Affects Versions: 1.10.0, format-2.5.0
>Reporter: Gabor Szadovszky
>Assignee: Gabor Szadovszky
>Priority: Minor
>
> After moving to gitbox the old apache repo 
> (https://git-wip-us.apache.org/repos/asf/parquet-format.git) is not working 
> anymore. The pom.xml shall be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PARQUET-1258) Update scm developer connection to github

2018-03-28 Thread Gabor Szadovszky (JIRA)
Gabor Szadovszky created PARQUET-1258:
-

 Summary: Update scm developer connection to github
 Key: PARQUET-1258
 URL: https://issues.apache.org/jira/browse/PARQUET-1258
 Project: Parquet
  Issue Type: Bug
  Components: parquet-format, parquet-mr
Affects Versions: 1.10.0, format-2.5.0
Reporter: Gabor Szadovszky
Assignee: Gabor Szadovszky


After moving to gitbox the old apache repo 
(https://git-wip-us.apache.org/repos/asf/parquet-format.git) is not working 
anymore. The pom.xml shall be updated accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PARQUET-1253) Support for new logical type representation

2018-03-28 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar reassigned PARQUET-1253:
--

Assignee: Nandor Kollar

> Support for new logical type representation
> ---
>
> Key: PARQUET-1253
> URL: https://issues.apache.org/jira/browse/PARQUET-1253
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-mr
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
>
> The latest parquet-format 
> [introduced|https://github.com/apache/parquet-format/commit/863875e0be3237c6aa4ed71733d54c91a51deabe#diff-0f9d1b5347959e15259da7ba8f4b6252]
>  a new representation for logical types. As of now this is not yet supported 
> in parquet-mr, so there is no way to use parameterized UTC-normalized 
> timestamp data types. When reading and writing Parquet files, parquet-mr 
> should use the new 'logicalType' field in SchemaElement, in addition to 
> 'converted_type', to indicate the current logical type annotation. To 
> maintain backward compatibility, the semantics of converted_type shouldn't 
> change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PARQUET-1257) GetRecordBatchReader in parquet/arrow/reader.h should be able to specify chunksize

2018-03-28 Thread Xianjin YE (JIRA)
Xianjin YE created PARQUET-1257:
---

 Summary: GetRecordBatchReader in parquet/arrow/reader.h should be 
able to specify chunksize
 Key: PARQUET-1257
 URL: https://issues.apache.org/jira/browse/PARQUET-1257
 Project: Parquet
  Issue Type: Improvement
  Components: parquet-cpp
Reporter: Xianjin YE


see [https://github.com/apache/parquet-cpp/pull/445] comments



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)