[jira] [Commented] (PARQUET-1286) Crypto package in parquet-mr

2018-05-14 Thread Gidon Gershinsky (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474724#comment-16474724
 ] 

Gidon Gershinsky commented on PARQUET-1286:
---

Hi [~mdeepak], sure, I'll post the available cpp code.

> Crypto package in parquet-mr
> 
>
> Key: PARQUET-1286
> URL: https://issues.apache.org/jira/browse/PARQUET-1286
> Project: Parquet
>  Issue Type: Sub-task
>  Components: parquet-mr
>Reporter: Gidon Gershinsky
>Assignee: Gidon Gershinsky
>Priority: Major
>
> The implementation of Parquet encryptors and decryptors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-1266) LogicalTypes union in parquet-format doesn't include UUID

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474432#comment-16474432
 ] 

ASF GitHub Bot commented on PARQUET-1266:
-

zivanfi closed pull request #93: PARQUET-1266: LogicalTypes union in 
parquet-format doesn't include UUID
URL: https://github.com/apache/parquet-format/pull/93
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index f3aac258..3b15cfef 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -318,7 +318,7 @@ struct BsonType {
  * following table.
  */
 union LogicalType {
-  1:  StringType STRING   // use ConvertedType UTF8 if encoding is UTF-8
+  1:  StringType STRING   // use ConvertedType UTF8
   2:  MapType MAP // use ConvertedType MAP
   3:  ListType LIST   // use ConvertedType LIST
   4:  EnumType ENUM   // use ConvertedType ENUM
@@ -331,6 +331,7 @@ union LogicalType {
   11: NullType UNKNOWN   // no compatible ConvertedType
   12: JsonType JSON   // use ConvertedType JSON
   13: BsonType BSON   // use ConvertedType BSON
+  14: UUIDType UUID
 }
 
 /**
@@ -381,7 +382,7 @@ struct SchemaElement {
   9: optional i32 field_id;
 
   /**
-   * The logical type of this SchemaElement; only valid for primitives.
+   * The logical type of this SchemaElement
    *
    * LogicalType replaces ConvertedType, but ConvertedType is still required
    * for some logical types to ensure forward-compatibility in format v1.


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> LogicalTypes union in parquet-format doesn't include UUID
> -
>
> Key: PARQUET-1266
> URL: https://issues.apache.org/jira/browse/PARQUET-1266
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-format
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Minor
>
> parquet-format new logical type representation doesn't include UUID type





[jira] [Commented] (PARQUET-1294) Update release scripts for the new Apache policy

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474429#comment-16474429
 ] 

ASF GitHub Bot commented on PARQUET-1294:
-

zivanfi closed pull request #97: PARQUET-1294: Update release scripts for the 
new Apache policy
URL: https://github.com/apache/parquet-format/pull/97
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/dev/source-release.sh b/dev/source-release.sh
index a58c520d..8d4e281c 100644
--- a/dev/source-release.sh
+++ b/dev/source-release.sh
@@ -58,8 +58,7 @@ git archive $release_hash --prefix $tag/ -o $tarball
 
 # sign the archive
 gpg --armor --output ${tarball}.asc --detach-sig $tarball
-gpg --print-md MD5 $tarball > ${tarball}.md5
-shasum $tarball > ${tarball}.sha
+shasum -a 512 $tarball > ${tarball}.sha512
 
 # check out the parquet RC folder
 svn co --depth=empty https://dist.apache.org/repos/dist/dev/parquet tmp
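
The change above drops the MD5 file and the default `shasum` output (SHA-1) in favour of a single SHA-512 checksum. As a minimal sketch of how such a `.sha512` file can be written and later checked, assuming that either `shasum` or `sha512sum` is installed (the helper names below are hypothetical and not part of the release scripts):

```shell
# Pick whichever SHA-512 tool is installed (assumption: at least one is).
sha512_tool() {
  if command -v shasum >/dev/null 2>&1; then
    shasum -a 512 "$@"
  else
    sha512sum "$@"
  fi
}

# Write <file>.sha512 next to the file, mirroring the updated script line.
write_sha512() {
  sha512_tool "$1" > "$1.sha512"
}

# Re-compute the digest and compare it with the recorded one.
# Returns non-zero if the file no longer matches its checksum.
verify_sha512() {
  sha512_tool -c "$1.sha512"
}
```

The checksum file records the path exactly as it was passed in, so verification has to run from a directory where that path still resolves.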


 




> Update release scripts for the new Apache policy
> 
>
> Key: PARQUET-1294
> URL: https://issues.apache.org/jira/browse/PARQUET-1294
> Project: Parquet
>  Issue Type: Task
>  Components: parquet-format, parquet-mr
>Reporter: Gabor Szadovszky
>Assignee: Gabor Szadovszky
>Priority: Major
>
> The Apache policy on checksums changed recently, so the related release 
> scripts need to be updated. The policy is detailed here: 
> http://www.apache.org/dev/release-distribution#sigs-and-sums





[jira] [Commented] (PARQUET-1296) Travis kills build after 10 minutes, because "no output was received"

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474416#comment-16474416
 ] 

ASF GitHub Bot commented on PARQUET-1296:
-

zivanfi commented on issue #476: PARQUET-1296: Travis kills build after 10 
minutes, because "no output…
URL: https://github.com/apache/parquet-mr/pull/476#issuecomment-388784482
 
 
   How about adding `sudo apt-get install pv` to the `before_install` section 
then using `| pv -fbi 60 > mvn_install.log` instead of ` > mvn_install.log`? 
This would display the size of the output every 60 seconds, so an execution 
taking 6 minutes would look something like:
   
   190 B
   530 B
   770 B
   1.10KiB
   1.43KiB
   1.78KiB
   
   A few extra lines would end up in the Travis log, but nothing too 
distracting, and it would be enough to prevent timeouts due to the lack of 
output.
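
   The same idea, periodic output while the real build output goes to a log file, can also be sketched without `pv`. The helper below is hypothetical (the name `run_with_heartbeat` and its arguments are assumptions, not part of parquet-mr or Travis). It runs a command in the background with its output redirected to a log, prints the log size at a fixed interval, and returns the command's own exit status. That last point matters for the piped variant too: a shell pipeline reports the exit status of its last command (here `pv`) unless bash's `pipefail` option is set.

```shell
# Hypothetical helper: run a command quietly, but print the log size every
# INTERVAL seconds so that the CI system keeps seeing output.
# Usage: run_with_heartbeat INTERVAL LOGFILE COMMAND [ARGS...]
run_with_heartbeat() (
  interval="$1"
  log="$2"
  shift 2

  # Start the real command in the background; all of its output goes to the log.
  "$@" > "$log" 2>&1 &
  pid=$!

  # While the command is alive, report how many bytes have been logged so far.
  while kill -0 "$pid" 2>/dev/null; do
    sleep "$interval"
    echo "$(wc -c < "$log" | tr -d ' ') bytes logged"
  done

  # Propagate the exit status of the command itself.
  wait "$pid"
)
```

   With this sketch the build step would read `run_with_heartbeat 60 mvn_install.log mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true -Dsource.skip=true || (cat mvn_install.log && false)`, so the log is still dumped when the build fails.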




> Travis kills build after 10 minutes, because "no output was received"
> -
>
> Key: PARQUET-1296
> URL: https://issues.apache.org/jira/browse/PARQUET-1296
> Project: Parquet
>  Issue Type: Bug
>Affects Versions: 1.8.2
>Reporter: Nandor Kollar
>Priority: Major
>
> {{mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true 
> -Dsource.skip=true > mvn_install.log || (cat mvn_install.log && false)}} 
> could take more than 10 minutes, and Travis 
> [kills|https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received]
>  the build in this case, since no output is produced (it is redirected to the 
> log file)





[jira] [Commented] (PARQUET-1296) Travis kills build after 10 minutes, because "no output was received"

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474414#comment-16474414
 ] 

ASF GitHub Bot commented on PARQUET-1296:
-

zivanfi commented on issue #476: PARQUET-1296: Travis kills build after 10 
minutes, because "no output…
URL: https://github.com/apache/parquet-mr/pull/476#issuecomment-388873937
 
 
   @xhochy Yes, but according to the latest Travis build, getting 
`travis_wait` to work with our command is non-trivial.




[jira] [Commented] (PARQUET-1253) Support for new logical type representation

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474331#comment-16474331
 ] 

ASF GitHub Bot commented on PARQUET-1253:
-

gszadovszky commented on a change in pull request #463: PARQUET-1253: Support 
for new logical type representation
URL: https://github.com/apache/parquet-mr/pull/463#discussion_r187987942
 
 

 ##
 File path: 
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
 ##
 @@ -164,7 +135,7 @@ protected LogicalTypeAnnotation fromString(List 
params) {
*/
   public abstract void accept(LogicalTypeAnnotationVisitor 
logicalTypeAnnotationVisitor);
 
-  public abstract LogicalTypes getType();
+  protected abstract LogicalTypeToken getType();
 
 Review comment:
   Good point. I would say that if something does not need to be public, then 
restrict its access before committing. The later this is done, the harder it 
is to remove from the public API.




> Support for new logical type representation
> ---
>
> Key: PARQUET-1253
> URL: https://issues.apache.org/jira/browse/PARQUET-1253
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-mr
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
>
> Latest parquet-format 
> [introduced|https://github.com/apache/parquet-format/commit/863875e0be3237c6aa4ed71733d54c91a51deabe#diff-0f9d1b5347959e15259da7ba8f4b6252]
>  a new representation for logical types. As of now this is not yet supported 
> in parquet-mr, thus there's no way to use parametrized UTC normalized 
> timestamp data types. When reading and writing Parquet files, besides 
> 'converted_type' parquet-mr should use the new 'logicalType' field in 
> SchemaElement to determine the current logical type annotation. To maintain 
> backward compatibility, the semantics of converted_type shouldn't change.





[jira] [Commented] (PARQUET-1253) Support for new logical type representation

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474271#comment-16474271
 ] 

ASF GitHub Bot commented on PARQUET-1253:
-

nandorKollar commented on a change in pull request #463: PARQUET-1253: Support 
for new logical type representation
URL: https://github.com/apache/parquet-mr/pull/463#discussion_r187972568
 
 

 ##
 File path: 
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
 ##
 @@ -164,7 +135,7 @@ protected LogicalTypeAnnotation fromString(List 
params) {
*/
   public abstract void accept(LogicalTypeAnnotationVisitor 
logicalTypeAnnotationVisitor);
 
-  public abstract LogicalTypes getType();
+  protected abstract LogicalTypeToken getType();
 
 Review comment:
   Indeed, thanks! I'm wondering whether we should narrow the scope of a bunch 
of other methods in LogicalTypeAnnotation to package private too. For example, 
toOriginalType is only used within the same package, apart from one test case, 
so I'm not sure that it should be part of the public API.




[jira] [Commented] (PARQUET-1253) Support for new logical type representation

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474268#comment-16474268
 ] 

ASF GitHub Bot commented on PARQUET-1253:
-

nandorKollar commented on a change in pull request #463: PARQUET-1253: Support 
for new logical type representation
URL: https://github.com/apache/parquet-mr/pull/463#discussion_r187970641
 
 

 ##
 File path: 
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
 ##
 @@ -19,28 +19,13 @@
 package org.apache.parquet.schema;
 
 import org.apache.parquet.Preconditions;
-import org.apache.parquet.format.BsonType;
-import org.apache.parquet.format.ConvertedType;
-import org.apache.parquet.format.DateType;
-import org.apache.parquet.format.DecimalType;
-import org.apache.parquet.format.EnumType;
-import org.apache.parquet.format.IntType;
-import org.apache.parquet.format.JsonType;
-import org.apache.parquet.format.ListType;
-import org.apache.parquet.format.LogicalType;
-import org.apache.parquet.format.MapType;
-import org.apache.parquet.format.MicroSeconds;
-import org.apache.parquet.format.MilliSeconds;
-import org.apache.parquet.format.NullType;
-import org.apache.parquet.format.StringType;
-import org.apache.parquet.format.TimeType;
-import org.apache.parquet.format.TimestampType;
 
 import java.util.List;
 import java.util.Objects;
 
 public abstract class LogicalTypeAnnotation {
-  public enum LogicalTypes {
+  // This is a private enum intended only for internal use for parsing the schema
+  public enum LogicalTypeToken {
 
 Review comment:
   Good idea, thanks Gabor!




[jira] [Created] (PARQUET-1299) [C++] Upgrade Brotli to latest version

2018-05-14 Thread Phillip Cloud (JIRA)
Phillip Cloud created PARQUET-1299:
--

 Summary: [C++] Upgrade Brotli to latest version
 Key: PARQUET-1299
 URL: https://issues.apache.org/jira/browse/PARQUET-1299
 Project: Parquet
  Issue Type: Improvement
  Components: parquet-cpp
Affects Versions: cpp-1.3.1
Reporter: Phillip Cloud
Assignee: Phillip Cloud








[jira] [Commented] (PARQUET-1296) Travis kills build after 10 minutes, because "no output was received"

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474104#comment-16474104
 ] 

ASF GitHub Bot commented on PARQUET-1296:
-

xhochy commented on issue #476: PARQUET-1296: Travis kills build after 10 
minutes, because "no output…
URL: https://github.com/apache/parquet-mr/pull/476#issuecomment-388799381
 
 
   @zivanfi This is nearly the same thing as what `travis_wait` does. The 
benefit of `travis_wait` is that it is already installed.




[jira] [Commented] (PARQUET-1286) Crypto package in parquet-mr

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474082#comment-16474082
 ] 

ASF GitHub Bot commented on PARQUET-1286:
-

majetideepak commented on issue #471: PARQUET-1286: Crypto package in parquet-mr
URL: https://github.com/apache/parquet-mr/pull/471#issuecomment-388792894
 
 
   @ggershinsky, in one of the sync calls, you mentioned that you worked on a 
C++ implementation. Can you post that work somewhere?




[jira] [Created] (PARQUET-1298) Project download page does not conform to Apache requirements

2018-05-14 Thread Zoltan Ivanfi (JIRA)
Zoltan Ivanfi created PARQUET-1298:
--

 Summary: Project download page does not conform to Apache 
requirements
 Key: PARQUET-1298
 URL: https://issues.apache.org/jira/browse/PARQUET-1298
 Project: Parquet
  Issue Type: Task
Reporter: Zoltan Ivanfi


Our last two release announcements were rejected on the following grounds:
{quote}Sorry, but the download page does not contain the required links to 
KEYS, sigs and hashes.

Please see:

[http://www.apache.org/dev/release-distribution#download-links]
and the previous section

There is a project download page at:
[http://parquet.apache.org/downloads/]

but it is rather out of date and does not have the required links either.
Note that the ASF releases open source, so the download page must include that.
{quote}
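
As a rough illustration of the requirement quoted above, the sketch below checks whether a page's text mentions a KEYS file, a signature and a hash. It is a hypothetical helper only: it assumes that signatures use an `.asc` suffix and hashes a `.sha256` or `.sha512` suffix, and a textual match does not prove that the links are valid.

```shell
# Hypothetical check: does the page text (read from stdin) mention a KEYS file,
# at least one .asc signature, and at least one .sha256/.sha512 checksum?
has_required_links() (
  page="$(cat)"
  printf '%s\n' "$page" | grep -q 'KEYS' &&
  printf '%s\n' "$page" | grep -Eq '\.asc' &&
  printf '%s\n' "$page" | grep -Eq '\.sha(256|512)'
)
```

The function reads the page from standard input, so it can be fed a saved copy of the download page; it returns non-zero as soon as one of the three kinds of link is missing.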





[jira] [Commented] (PARQUET-1296) Travis kills build after 10 minutes, because "no output was received"

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473872#comment-16473872
 ] 

ASF GitHub Bot commented on PARQUET-1296:
-

gszadovszky commented on issue #476: PARQUET-1296: Travis kills build after 10 
minutes, because "no output…
URL: https://github.com/apache/parquet-mr/pull/476#issuecomment-388726601
 
 
   Thanks, @xhochy, for the explanation. I'm OK with the actual solution, then. 
(After it is working properly...)




[jira] [Commented] (PARQUET-1253) Support for new logical type representation

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473867#comment-16473867
 ] 

ASF GitHub Bot commented on PARQUET-1253:
-

gszadovszky commented on a change in pull request #463: PARQUET-1253: Support 
for new logical type representation
URL: https://github.com/apache/parquet-mr/pull/463#discussion_r187858363
 
 

 ##
 File path: 
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
 ##
 @@ -164,7 +135,7 @@ protected LogicalTypeAnnotation fromString(List 
params) {
*/
   public abstract void accept(LogicalTypeAnnotationVisitor 
logicalTypeAnnotationVisitor);
 
-  public abstract LogicalTypes getType();
+  protected abstract LogicalTypeToken getType();
 
 Review comment:
   nit: package private would be fine here.




[jira] [Commented] (PARQUET-1253) Support for new logical type representation

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473866#comment-16473866
 ] 

ASF GitHub Bot commented on PARQUET-1253:
-

gszadovszky commented on a change in pull request #463: PARQUET-1253: Support 
for new logical type representation
URL: https://github.com/apache/parquet-mr/pull/463#discussion_r187856940
 
 

 ##
 File path: 
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
 ##
 @@ -19,28 +19,13 @@
 package org.apache.parquet.schema;
 
 import org.apache.parquet.Preconditions;
-import org.apache.parquet.format.BsonType;
-import org.apache.parquet.format.ConvertedType;
-import org.apache.parquet.format.DateType;
-import org.apache.parquet.format.DecimalType;
-import org.apache.parquet.format.EnumType;
-import org.apache.parquet.format.IntType;
-import org.apache.parquet.format.JsonType;
-import org.apache.parquet.format.ListType;
-import org.apache.parquet.format.LogicalType;
-import org.apache.parquet.format.MapType;
-import org.apache.parquet.format.MicroSeconds;
-import org.apache.parquet.format.MilliSeconds;
-import org.apache.parquet.format.NullType;
-import org.apache.parquet.format.StringType;
-import org.apache.parquet.format.TimeType;
-import org.apache.parquet.format.TimestampType;
 
 import java.util.List;
 import java.util.Objects;
 
 public abstract class LogicalTypeAnnotation {
-  public enum LogicalTypes {
+  // This is a private enum intended only for internal use for parsing the schema
+  public enum LogicalTypeToken {
 
 Review comment:
   As far as I can see, this enum is used only within the schema package. I 
would suggest package-private access, so that the comment is not needed and 
clients cannot misuse it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support for new logical type representation
> ---
>
> Key: PARQUET-1253
> URL: https://issues.apache.org/jira/browse/PARQUET-1253
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-mr
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
>Priority: Major
>
> Latest parquet-format 
> [introduced|https://github.com/apache/parquet-format/commit/863875e0be3237c6aa4ed71733d54c91a51deabe#diff-0f9d1b5347959e15259da7ba8f4b6252]
>  a new representation for logical types. This is not yet supported in 
> parquet-mr, so there is no way to use parametrized, UTC-normalized 
> timestamp data types. When reading and writing Parquet files, parquet-mr 
> should use the new 'logicalType' field in SchemaElement, in addition to 
> 'converted_type', to determine the current logical type annotation. To 
> maintain backward compatibility, the semantics of converted_type shouldn't 
> change.





[jira] [Commented] (PARQUET-1296) Travis kills build after 10 minutes, because "no output was received"

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473862#comment-16473862
 ] 

ASF GitHub Bot commented on PARQUET-1296:
-

xhochy commented on issue #476: PARQUET-1296: Travis kills build after 10 
minutes, because "no output…
URL: https://github.com/apache/parquet-mr/pull/476#issuecomment-388722823
 
 
   > Why do we log to a file to print out its content when the execution fails? 
Travis captures the output, so I don't think we need to store the logs as a 
file.
   
   Having an enormous log is quite annoying when looking for a problem in 
another part of the job. Basically, it makes the Travis UI unusable, and you 
need to resort to downloading the log and using grep on your machine. Printing 
only parts of the log has helped me in other projects to find the problem more 
quickly and give appropriate feedback on PRs. If you do a lot of reviews, this 
is quite a significant time saver.
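
The idea of printing only part of the log can be sketched roughly as below. 
This is a minimal sketch, not what the PR does: `run_quiet` is a hypothetical 
helper name, and the log path and line count in the usage comment are 
illustrative. It still prints nothing while the command runs, so on its own it 
does not avoid the "no output was received" timeout.

```shell
# Hypothetical helper: run a command quietly, keep the full log on disk,
# and print only the last lines of the log if the command fails.
# Usage: run_quiet LOGFILE LINES COMMAND [ARGS...]
run_quiet() {
  log=$1
  lines=$2
  shift 2
  if "$@" > "$log" 2>&1; then
    return 0
  else
    rc=$?   # exit status of the failed command
    echo "Command failed with status $rc; last $lines lines of $log:"
    tail -n "$lines" "$log"
    return "$rc"
  fi
}

# Example (illustrative values):
# run_quiet mvn_install.log 200 mvn install --batch-mode -DskipTests=true
```

The full log stays in the file for anyone who needs it, while the Travis 
output only grows when something actually fails.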




> Travis kills build after 10 minutes, because "no output was received"
> -
>
> Key: PARQUET-1296
> URL: https://issues.apache.org/jira/browse/PARQUET-1296
> Project: Parquet
>  Issue Type: Bug
>Affects Versions: 1.8.2
>Reporter: Nandor Kollar
>Priority: Major
>
> {{mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true 
> -Dsource.skip=true > mvn_install.log || (cat mvn_install.log && false)}} 
> could take more than 10 minutes, and Travis 
> [kills|https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received]
>  the build in this case, since no output is produced (it is redirected to the 
> log file)





[jira] [Commented] (PARQUET-1296) Travis kills build after 10 minutes, because "no output was received"

2018-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473856#comment-16473856
 ] 

ASF GitHub Bot commented on PARQUET-1296:
-

gszadovszky commented on issue #476: PARQUET-1296: Travis kills build after 10 
minutes, because "no output…
URL: https://github.com/apache/parquet-mr/pull/476#issuecomment-388720349
 
 
   Why do we log to a file to print out its content when the execution fails? 
Travis captures the output, so I don't think we need to store the logs as a 
file.
   If we really would like to keep the logs in a file, we should use `mvn 
install ... | tee mvn_install.log`, so the logs are also printed to stdout 
continuously and Travis won't terminate the build.
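
One caveat with a `tee` pipeline: in a plain shell pipeline the exit status is 
that of the last command (`tee`), so a failing build could be reported as 
successful. Below is a minimal sketch that streams the output, keeps a log 
file, and still returns the build's own exit status. It assumes a POSIX shell 
with `mktemp` available; `run_logged` is a hypothetical helper name, not 
something from the PR. In bash, `set -o pipefail` is a shorter alternative.

```shell
# Hypothetical helper: stream a command's output to stdout and to a log file,
# and return the command's own exit status instead of tee's.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
  log=$1
  shift
  status_file=$(mktemp)
  # The left side of the pipe runs in a subshell, so the command's exit
  # status is written to a temporary file and read back afterwards.
  { "$@" 2>&1; echo "$?" > "$status_file"; } | tee "$log"
  rc=$(cat "$status_file")
  rm -f "$status_file"
  return "$rc"
}

# Example (illustrative):
# run_logged mvn_install.log mvn install --batch-mode -DskipTests=true
```

Because output reaches stdout continuously, Travis keeps receiving output 
during long builds, and a failed command still fails the job.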




> Travis kills build after 10 minutes, because "no output was received"
> -
>
> Key: PARQUET-1296
> URL: https://issues.apache.org/jira/browse/PARQUET-1296
> Project: Parquet
>  Issue Type: Bug
>Affects Versions: 1.8.2
>Reporter: Nandor Kollar
>Priority: Major
>
> {{mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true 
> -Dsource.skip=true > mvn_install.log || (cat mvn_install.log && false)}} 
> could take more than 10 minutes, and Travis 
> [kills|https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received]
>  the build in this case, since no output is produced (it is redirected to the 
> log file)




