[jira] [Commented] (ORC-175) ZLIB performance

2018-08-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575394#comment-16575394
 ] 

ASF GitHub Bot commented on ORC-175:


Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/159


> ZLIB performance
> 
>
> Key: ORC-175
> URL: https://issues.apache.org/jira/browse/ORC-175
> Project: ORC
>  Issue Type: Improvement
>  Components: Java
>Reporter: iamhumanbeing
>Assignee: iamhumanbeing
>Priority: Major
>  Labels: performance
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ORC-175) ZLIB performance

2017-09-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160952#comment-16160952
 ] 

ASF GitHub Bot commented on ORC-175:


Github user xndai commented on the issue:

https://github.com/apache/orc/pull/159
  
@iamhumanbeing did you compare it with zstd? Based on my experience, zstd 
is way better than igzip. I would expect a similar result with ISA-L. It 
doesn't seem to add a lot of value if we plan to support zstd in the near 
future.




[jira] [Commented] (ORC-175) ZLIB performance

2017-09-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152021#comment-16152021
 ] 

ASF GitHub Bot commented on ORC-175:


Github user iamhumanbeing commented on the issue:

https://github.com/apache/orc/pull/159
  
@omalley: How about we add a compile-time option that uses ISA-L's zlib 
support for ZLIB compression and decompression? This option would be purely 
a performance optimization.




[jira] [Commented] (ORC-175) ZLIB performance

2017-08-29 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146185#comment-16146185
 ] 

Gopal V commented on ORC-175:
-

The fact that the underlying codec is compatible means that we should be able 
to extend the decompression speedups to all users who already use Zlib for 
their storage - which is a big advantage if it can be pulled off.

If Hadoop's ZlibCodec could be built to work against ISA-L, then all of the 
projects would get ISA-L support at the same time.

That might be much easier than trying to build it into ORC alone, since 
hadoop-common already has a partial dependency on ISA-L and JNI code which 
depends on it.

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/CMakeLists.txt#L126

{code}
find_library(ISAL_LIBRARY
    NAMES isal
    PATHS ${CUSTOM_ISAL_PREFIX} ${CUSTOM_ISAL_PREFIX}/lib
          ${CUSTOM_ISAL_PREFIX}/lib64 ${CUSTOM_ISAL_LIB} /usr/lib)
{code}



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146136#comment-16146136
 ] 

ASF GitHub Bot commented on ORC-175:


Github user omalley commented on the issue:

https://github.com/apache/orc/pull/159
  
Is the ISA-L zlib support sufficient to read and write the ORC files with 
zlib compression? I agree with Gopal that it doesn't feel like a separate 
compression codec.

I don't think it is a good idea to build in support for proprietary 
compression formats. If it is just a performance optimization, that might 
be workable. As a unique codec, it isn't.




[jira] [Commented] (ORC-175) ZLIB performance

2017-08-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139500#comment-16139500
 ] 

ASF GitHub Bot commented on ORC-175:


Github user 10110346 commented on the issue:

https://github.com/apache/orc/pull/159
  
LGTM




[jira] [Commented] (ORC-175) ZLIB performance

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137820#comment-16137820
 ] 

ASF GitHub Bot commented on ORC-175:


Github user iamhumanbeing commented on a diff in the pull request:

https://github.com/apache/orc/pull/159#discussion_r134650087
  
--- Diff: java/bench/src/java/org/apache/orc/bench/CompressionKind.java ---
@@ -53,6 +54,8 @@ public OutputStream create(OutputStream out) throws IOException {
         return new GZIPOutputStream(out);
       case SNAPPY:
         return new SnappyCodec().createOutputStream(out);
+      case ISAL:
--- End diff --

1. igzip is only one part of ISA-L.
2. igzip only supports compression levels 0 and 1, so it cannot fully 
replace libz.
3. igzip's API cannot directly replace libz's API.




[jira] [Commented] (ORC-175) ZLIB performance

2017-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136559#comment-16136559
 ] 

ASF GitHub Bot commented on ORC-175:


Github user t3rmin4t0r commented on a diff in the pull request:

https://github.com/apache/orc/pull/159#discussion_r134427978
  
--- Diff: java/bench/src/java/org/apache/orc/bench/CompressionKind.java ---
@@ -53,6 +54,8 @@ public OutputStream create(OutputStream out) throws IOException {
         return new GZIPOutputStream(out);
       case SNAPPY:
         return new SnappyCodec().createOutputStream(out);
+      case ISAL:
--- End diff --

ISAL is just the library for gzip? That doesn't need a new codec - have you 
tried LD_PRELOAD to load ISAL instead of libz?




[jira] [Commented] (ORC-175) ZLIB performance

2017-08-22 Thread iamhumanbeing (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136466#comment-16136466
 ] 

iamhumanbeing commented on ORC-175:
---

https://github.com/apache/orc/pull/159



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-16 Thread iamhumanbeing (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128514#comment-16128514
 ] 

iamhumanbeing commented on ORC-175:
---

[~owen.omalley]



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-15 Thread iamhumanbeing (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126936#comment-16126936
 ] 

iamhumanbeing commented on ORC-175:
---

[~gopalv]: I got orc-benchmark data for ISA-L:

Benchmark                      Codec  Dataset  Mode  Cnt         Score         Error  Units
ColumnProjectionBenchmark.orc  isal   taxi     avgt    3    944503.179 ±   43834.261  us/op
ColumnProjectionBenchmark.orc  zlib   taxi     avgt    3   1029682.551 ±   26364.565  us/op
FullReadBenchmark.orc          isal   taxi     avgt    3  14192224.371 ±  180230.436  us/op
FullReadBenchmark.orc          zlib   taxi     avgt    3  16234465.657 ±  415264.953  us/op

That looks like a speedup of roughly 8%-14%.
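
For reference, the speedup range can be checked directly from the mean scores above. The sketch below is only arithmetic on those four us/op values; the class and method names are made up for illustration, and the error bars are ignored. Measured as time saved relative to zlib it gives about 8% and 13%, and measured as throughput gain it gives about 9% and 14%.

```java
final class OrcBenchSpeedup {

    /** Share of wall time saved by the candidate, as a percentage of the baseline time. */
    static double timeSavedPercent(double baselineMicros, double candidateMicros) {
        return (baselineMicros - candidateMicros) / baselineMicros * 100.0;
    }

    /** Throughput gain of the candidate over the baseline, as a percentage. */
    static double throughputGainPercent(double baselineMicros, double candidateMicros) {
        return (baselineMicros / candidateMicros - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        String[] names = {"ColumnProjectionBenchmark", "FullReadBenchmark"};
        // Mean us/op scores copied from the benchmark output: {zlib, isal}.
        double[][] scores = {
            {1029682.551, 944503.179},
            {16234465.657, 14192224.371},
        };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%s: time saved %.2f%%, throughput gain %.2f%%%n",
                    names[i],
                    timeSavedPercent(scores[i][0], scores[i][1]),
                    throughputGainPercent(scores[i][0], scores[i][1]));
        }
    }
}
```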



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-09 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119622#comment-16119622
 ] 

Gopal V commented on ORC-175:
-

igzip is not testing the same codepath as ORC, because ORC does use a mix of 
LZ77 widths and tries to use faster decode loops, at lower compression levels 
(it uses different zlib combinations for different column types).

https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/ZlibCodec.java#L138

Does ISA-L offer any meaningful speedups for those combinations of 
Zlib-compatible algorithms?
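
To make "different zlib combinations" concrete, here is a minimal sketch using only the JDK's java.util.zip classes. It is not ORC's ZlibCodec: the class name, the sample data, and the level/strategy pairs are chosen purely for illustration, and the combinations ORC actually uses are in the source linked above. It writes raw DEFLATE streams (nowrap) with several level and strategy settings and checks that a single inflater configuration reads all of them back, which is the property a drop-in zlib-compatible library would need to preserve.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class RawDeflateCombos {

    /** Compresses input as raw DEFLATE (no zlib header) with the given level and strategy. */
    static byte[] compress(byte[] input, int level, int strategy) {
        Deflater deflater = new Deflater(level, true); // true = nowrap
        try {
            deflater.setStrategy(strategy);
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Inflates a raw DEFLATE stream whose uncompressed length is known in advance. */
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater(true); // true = nowrap
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[originalLength];
            int filled = 0;
            while (filled < out.length && !inflater.finished()) {
                int n = inflater.inflate(out, filled, out.length - filled);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated stream or preset dictionary required
                }
                filled += n;
            }
            return Arrays.copyOf(out, filled);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt DEFLATE stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            text.append("ORC column data, row ").append(i % 10).append(". ");
        }
        byte[] data = text.toString().getBytes(StandardCharsets.UTF_8);
        // Illustrative level/strategy pairs only; not taken from ORC.
        int[][] combos = {
            {Deflater.BEST_SPEED, Deflater.DEFAULT_STRATEGY},
            {Deflater.BEST_SPEED, Deflater.FILTERED},
            {Deflater.DEFAULT_COMPRESSION, Deflater.DEFAULT_STRATEGY},
            {Deflater.DEFAULT_COMPRESSION, Deflater.FILTERED},
        };
        for (int[] combo : combos) {
            byte[] packed = compress(data, combo[0], combo[1]);
            boolean ok = Arrays.equals(data, decompress(packed, data.length));
            System.out.printf("level=%d strategy=%d -> %d bytes, round trip ok=%b%n",
                    combo[0], combo[1], packed.length, ok);
        }
    }
}
```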



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-09 Thread iamhumanbeing (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119607#comment-16119607
 ] 

iamhumanbeing commented on ORC-175:
---

Gopal V: inflate performance -> ZLIB: 197 MB/s; ISA-L: 356 MB/s

perf@perf-master:~/git/isa-l/igzip$ ./igzip_inflate_perf part-00127
isal_inflate_perf:
Using igzip compression
igzip_zlib_inflate_perf: part-00127 215 iterations
  file part-00127 - in_size=4632710 out_size=4632710 iter=215
igzip_file: runtime =5053893 usecs, bandwidth 949 MB in 5.0539 sec = 197.08 MB/s
End of igzip_zlib_inflate_perf

isal_inflate_stateless_perf: part-00127 215 iterations
  file part-00127 - in_size=4632710 out_size=4632710 iter=215
igzip_file: runtime =2794267 usecs, bandwidth 949 MB in 2.7943 sec = 356.46 MB/s
End of isal_inflate_stateless_perf
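
The MB/s figures follow from the sizes and runtimes in the output above. The sketch below reproduces them, assuming the rate is total bytes (out_size times iterations) divided by runtime in microseconds, i.e. 10^6 bytes per second; that assumption matches the printed 197.08 and 356.46 but is inferred from the numbers rather than taken from the ISA-L source. The class name is made up for illustration. The ratio of the two rates is about 1.8x.

```java
final class InflateThroughput {

    /** Bytes per microsecond, which is numerically the same as 10^6 bytes per second (MB/s). */
    static double megabytesPerSecond(long bytesPerIteration, int iterations, long runtimeMicros) {
        double totalBytes = (double) bytesPerIteration * iterations;
        return totalBytes / runtimeMicros;
    }

    public static void main(String[] args) {
        long outSize = 4632710L; // out_size from the perf output
        int iterations = 215;    // iter from the perf output
        double zlib = megabytesPerSecond(outSize, iterations, 5053893L);
        double isal = megabytesPerSecond(outSize, iterations, 2794267L);
        System.out.printf("zlib  inflate: %.2f MB/s%n", zlib);
        System.out.printf("ISA-L inflate: %.2f MB/s%n", isal);
        System.out.printf("ratio: %.2fx%n", isal / zlib);
    }
}
```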




[jira] [Commented] (ORC-175) ZLIB performance

2017-08-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119392#comment-16119392
 ] 

Gopal V commented on ORC-175:
-

[~iamhumanbeing]: the really interesting part is not the compressor, but the 
inflate_fast() performance - do you have any numbers on the throughput gains?



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-08 Thread iamhumanbeing (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119368#comment-16119368
 ] 

iamhumanbeing commented on ORC-175:
---

The ISA-L API is compatible with the zlib C API. We have tested it on Hive 
1.2.1. For "insert table" SQL statements, we saw a 10% performance improvement.



[jira] [Commented] (ORC-175) ZLIB performance

2017-08-07 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116796#comment-16116796
 ] 

Owen O'Malley commented on ORC-175:
---

Is ISA-L API-compatible with zlib, or is the change bigger than that?
