[jira] [Commented] (SPARK-20241) java.lang.IllegalArgumentException: Can not set final [B field org.codehaus.janino.util.ClassFile$CodeAttribute.code to org.codehaus.janino.util.ClassFile$CodeAttribut

2017-04-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959495#comment-15959495 ] Kazuaki Ishizaki commented on SPARK-20241: -- I think that this problem is ca

[jira] [Commented] (SPARK-20210) Scala tests aborted in Spark SQL on ppc64le

2017-04-05 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956436#comment-15956436 ] Kazuaki Ishizaki commented on SPARK-20210: -- I run the following two t

[jira] [Commented] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-05 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956420#comment-15956420 ] Kazuaki Ishizaki commented on SPARK-20184: -- If the number of rows are sma

Re: how do i force unit test to do whole stage codegen

2017-04-04 Thread Kazuaki Ishizaki
opic for d...@spark.apache.org. Kazuaki Ishizaki From: Koert Kuipers To: "user@spark.apache.org" Date: 2017/04/05 05:12 Subject: how do i force unit test to do whole stage codegen i wrote my own expression with eval and doGenCode, but doGenCode never gets called in test

[jira] [Commented] (SPARK-20176) Spark Dataframe UDAF issue

2017-04-03 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954093#comment-15954093 ] Kazuaki Ishizaki commented on SPARK-20176: -- Thanks. The code seems to work

[jira] [Comment Edited] (SPARK-20176) Spark Dataframe UDAF issue

2017-04-03 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954093#comment-15954093 ] Kazuaki Ishizaki edited comment on SPARK-20176 at 4/3/17 8:1

Re: [VOTE] Apache Spark 2.1.1 (RC2)

2017-04-02 Thread Kazuaki Ishizaki
Thank you. Yes, it is not a regression. 2.1.0 would have this failure, too. Regards, Kazuaki Ishizaki From: Sean Owen To: Kazuaki Ishizaki/Japan/IBM@IBMJP, Michael Armbrust Cc: "dev@spark.apache.org" Date: 2017/04/02 18:18 Subject: Re: [VOTE] Apache Spark

Re: [VOTE] Apache Spark 2.1.1 (RC2)

2017-04-02 Thread Kazuaki Ishizaki
0.002 sec <<< ERROR! java.lang.IllegalArgumentException: requirement failed: No support for unaligned Unsafe. Set spark.memory.offHeap.enabled to false. ... Tests run: 207, Failures: 7, Errors: 16, Skipped: 0 Kazuaki Ishizaki From: Michael Armbrust To: "dev@spark.apache.org

[jira] [Commented] (SPARK-20176) Spark Dataframe UDAF issue

2017-03-31 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951473#comment-15951473 ] Kazuaki Ishizaki commented on SPARK-20176: -- Could you please post the pro

[jira] [Comment Edited] (SPARK-20158) crash in Spark sql insert in partitioned hive tables

2017-03-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949364#comment-15949364 ] Kazuaki Ishizaki edited comment on SPARK-20158 at 3/30/17 4:3

[jira] [Commented] (SPARK-20158) crash in Spark sql insert in partitioned hive tables

2017-03-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949364#comment-15949364 ] Kazuaki Ishizaki commented on SPARK-20158: -- What program did you run? >

[jira] [Comment Edited] (SPARK-19984) ERROR codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java'

2017-03-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949359#comment-15949359 ] Kazuaki Ishizaki edited comment on SPARK-19984 at 3/30/17 4:3

[jira] [Commented] (SPARK-19984) ERROR codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java'

2017-03-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949359#comment-15949359 ] Kazuaki Ishizaki commented on SPARK-19984: -- While I tried a reproduc

[jira] [Commented] (SPARK-20112) SIGSEGV in GeneratedIterator.sort_addToSorter

2017-03-28 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1594#comment-1594 ] Kazuaki Ishizaki commented on SPARK-20112: -- [~MasterDDT] Thank you

[jira] [Commented] (SPARK-20112) SIGSEGV in GeneratedIterator.sort_addToSorter

2017-03-28 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945389#comment-15945389 ] Kazuaki Ishizaki commented on SPARK-20112: -- SPARK-18745 fixed integer over

[jira] [Created] (SPARK-20101) Use OffHeapColumnVector when "spark.memory.offHeap.enabled" is set to "true"

2017-03-26 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20101: Summary: Use OffHeapColumnVector when "spark.memory.offHeap.enabled" is set to "true" Key: SPARK-20101 URL: https://issues.apache.org/jir

[jira] [Commented] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-03-23 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939511#comment-15939511 ] Kazuaki Ishizaki commented on SPARK-19372: -- I implemented the code to take

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937907#comment-15937907 ] Kazuaki Ishizaki commented on SPARK-14083: -- I agree with you. I do not t

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937886#comment-15937886 ] Kazuaki Ishizaki commented on SPARK-14083: -- [~viirya] For a while, I wil

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-23 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937883#comment-15937883 ] Kazuaki Ishizaki commented on SPARK-14083: -- [~maropu] Thanks. > Anal

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937783#comment-15937783 ] Kazuaki Ishizaki commented on SPARK-14083: -- [~viirya] Thank you for

[jira] [Updated] (SPARK-20046) Facilitate loop optimizations in a JIT compiler regarding sqlContext.read.parquet()

2017-03-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20046: - Issue Type: Improvement (was: Bug) > Facilitate loop optimizations in a JIT compi

[jira] [Created] (SPARK-20046) Facilitate loop optimizations in a JIT compiler regarding sqlContext.read.parquet()

2017-03-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20046: Summary: Facilitate loop optimizations in a JIT compiler regarding sqlContext.read.parquet() Key: SPARK-20046 URL: https://issues.apache.org/jira/browse/SPARK-20046

Re: Why are DataFrames always read with nullable=True?

2017-03-20 Thread Kazuaki Ishizaki
null exists in non-null column. Any comments are appreciated. Kazuaki Ishizaki From: Jason White To: dev@spark.apache.org Date: 2017/03/21 06:31 Subject: Why are DataFrames always read with nullable=True? If I create a dataframe in Spark with non-nullable columns, and then save

Re: [Spark SQL & Core]: RDD to Dataset 1500 columns data with createDataFrame() throw exception of grows beyond 64 KB

2017-03-18 Thread Kazuaki Ishizaki
org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439) at org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358) ... While this PR https://github.com/apache/spark/pull/16648 addresses the number of the constant pool issue, it has not been merged yet. Regards, Kazuaki Ishizaki

[jira] [Comment Edited] (SPARK-19984) ERROR codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java'

2017-03-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929399#comment-15929399 ] Kazuaki Ishizaki edited comment on SPARK-19984 at 3/17/17 4:2

[jira] [Commented] (SPARK-19984) ERROR codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java'

2017-03-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929399#comment-15929399 ] Kazuaki Ishizaki commented on SPARK-19984: -- This problem occurs since S

[jira] [Created] (SPARK-19959) df[java.lang.Long].collect throws NullPointerException if df includes null

2017-03-15 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-19959: Summary: df[java.lang.Long].collect throws NullPointerException if df includes null Key: SPARK-19959 URL: https://issues.apache.org/jira/browse/SPARK-19959

[jira] [Commented] (SPARK-19950) nullable ignored when df.load() is executed for file-based data source

2017-03-14 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15925357#comment-15925357 ] Kazuaki Ishizaki commented on SPARK-19950: -- [~hyukjin.kwon] Thank you

[jira] [Created] (SPARK-19950) nullable ignored when df.load() is executed for file-based data source

2017-03-14 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-19950: Summary: nullable ignored when df.load() is executed for file-based data source Key: SPARK-19950 URL: https://issues.apache.org/jira/browse/SPARK-19950

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904460#comment-15904460 ] Kazuaki Ishizaki commented on SPARK-14083: -- I rebased this with master: h

[jira] [Commented] (SPARK-19875) Map->filter on many columns gets stuck in constraint inference optimization code

2017-03-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903584#comment-15903584 ] Kazuaki Ishizaki commented on SPARK-19875: -- I got the following stack t

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2017-03-05 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896452#comment-15896452 ] Kazuaki Ishizaki commented on SPARK-14083: -- Does anyone go forward with

[jira] [Commented] (SPARK-19503) Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as df.sort(...).count()

2017-03-04 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895740#comment-15895740 ] Kazuaki Ishizaki commented on SPARK-19503: -- Is it better to control whethe

[jira] [Commented] (SPARK-19503) Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as df.sort(...).count()

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891699#comment-15891699 ] Kazuaki Ishizaki commented on SPARK-19503: -- If it is good to leave sort in

[jira] [Commented] (SPARK-19468) Dataset slow because of unnecessary shuffles

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891658#comment-15891658 ] Kazuaki Ishizaki commented on SPARK-19468: -- Interesting. For {{val joi

[jira] [Comment Edited] (SPARK-19741) ClassCastException when using Dataset with type containing value types

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891548#comment-15891548 ] Kazuaki Ishizaki edited comment on SPARK-19741 at 3/2/17 3:1

[jira] [Commented] (SPARK-19741) ClassCastException when using Dataset with type containing value types

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891548#comment-15891548 ] Kazuaki Ishizaki commented on SPARK-19741: -- I am afraid whether my sa

[jira] [Commented] (SPARK-19741) ClassCastException when using Dataset with type containing value types

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890712#comment-15890712 ] Kazuaki Ishizaki commented on SPARK-19741: -- The following program cause

[jira] [Updated] (SPARK-19786) Facilitate loop optimizations in a JIT compiler regarding range()

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-19786: - Summary: Facilitate loop optimizations in a JIT compiler regarding range() (was

[jira] [Created] (SPARK-19786) Facilitate loop optimization in a JIT compiler regarding range()

2017-03-01 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-19786: Summary: Facilitate loop optimization in a JIT compiler regarding range() Key: SPARK-19786 URL: https://issues.apache.org/jira/browse/SPARK-19786 Project

[jira] [Commented] (SPARK-19741) ClassCastException when using Dataset with type containing value types

2017-02-26 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885171#comment-15885171 ] Kazuaki Ishizaki commented on SPARK-19741: -- Would it be possible to attac

[jira] [Commented] (SPARK-15678) Not use cache on appends and overwrites

2017-02-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883035#comment-15883035 ] Kazuaki Ishizaki commented on SPARK-15678: -- Sorry for being late to r

[jira] [Commented] (SPARK-15678) Not use cache on appends and overwrites

2017-02-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877575#comment-15877575 ] Kazuaki Ishizaki commented on SPARK-15678: -- How about in

[jira] [Comment Edited] (SPARK-15678) Not use cache on appends and overwrites

2017-02-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877575#comment-15877575 ] Kazuaki Ishizaki edited comment on SPARK-15678 at 2/22/17 6:3

Re: A DataFrame cache bug

2017-02-21 Thread Kazuaki Ishizaki
"id>10") return correct result. f(df1).count // output 89 which is incorrect Regards, Kazuaki Ishizaki From: gen tang To: dev@spark.apache.org Date: 2017/02/22 15:02 Subject: Re: A DataFrame cache bug Hi All, I might find a related issue on jira: https://issues.apa

[jira] [Commented] (SPARK-19653) `Vector` Type Should Be A First-Class Citizen In Spark SQL

2017-02-17 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872901#comment-15872901 ] Kazuaki Ishizaki commented on SPARK-19653: -- cc: [~cloud_fan] > `Vecto

Re: welcoming Takuya Ueshin as a new Apache Spark committer

2017-02-13 Thread Kazuaki Ishizaki
Congrats! Kazuaki Ishizaki From: Reynold Xin To: "dev@spark.apache.org" Date: 2017/02/14 04:18 Subject: welcoming Takuya Ueshin as a new Apache Spark committer Hi all, Takuya-san has recently been elected an Apache Spark committer. He's been active in t

[jira] [Resolved] (SPARK-16043) Prepare GenericArrayData implementation specialized for a primitive array

2017-02-03 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki resolved SPARK-16043. -- Resolution: Fixed Fix Version/s: 2.2.0 > Prepare GenericArrayD

[jira] [Commented] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-02-03 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851472#comment-15851472 ] Kazuaki Ishizaki commented on SPARK-19372: -- I was able to reproduce this.

Re: Spark performance tests

2017-01-10 Thread Kazuaki Ishizaki
Hi, You may find several micro-benchmarks under https://github.com/apache/spark/tree/master/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark . Regards, Kazuaki Ishizaki From: Prasun Ratn To: Apache Spark Dev Date: 2017/01/10 12:52 Subject: Spark performance

Re: Quick request: prolific PR openers, review your open PRs

2017-01-08 Thread Kazuaki Ishizaki
Sure, I updated the status of some PRs. Regards, Kazuaki Ishizaki From: Sean Owen To: dev Date: 2017/01/04 21:37 Subject: Quick request: prolific PR openers, review your open PRs Just saw that there are many people with >= 8 open PRs. Some are legitimately in flight but m

[jira] [Commented] (SPARK-19008) Avoid boxing/unboxing overhead of calling a lambda with primitive type from Dataset program

2016-12-26 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779544#comment-15779544 ] Kazuaki Ishizaki commented on SPARK-19008: -- I will work on this >

[jira] [Updated] (SPARK-19008) Avoid boxing/unboxing overhead of calling a lambda with primitive type from Dataset program

2016-12-26 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-19008: - Description: In a [discussion|https://github.com/apache/spark/pull/16391

[jira] [Created] (SPARK-19008) Avoid boxing/unboxing overhead of calling a lambda with primitive type from Dataset program

2016-12-26 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-19008: Summary: Avoid boxing/unboxing overhead of calling a lambda with primitive type from Dataset program Key: SPARK-19008 URL: https://issues.apache.org/jira/browse/SPARK

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2016-12-26 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779288#comment-15779288 ] Kazuaki Ishizaki commented on SPARK-14083: -- [Here|https://github.com/ap

Sharing data in columnar storage between two applications

2016-12-25 Thread Kazuaki Ishizaki
both applications support Apache Arrow APIs. Other approaches are possible. Which approach would be good for all applications? Regards, Kazuaki Ishizaki

[jira] [Commented] (SPARK-18859) Catalyst codegen does not mark column as nullable when it should. Causes NPE

2016-12-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764789#comment-15764789 ] Kazuaki Ishizaki commented on SPARK-18859: -- I think that this is an issu

[jira] [Commented] (SPARK-18814) CheckAnalysis rejects TPCDS query 32

2016-12-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742828#comment-15742828 ] Kazuaki Ishizaki commented on SPARK-18814: -- I found the same e

[jira] [Commented] (SPARK-16073) Performance of Parquet encodings on saving primitive arrays

2016-12-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15740807#comment-15740807 ] Kazuaki Ishizaki commented on SPARK-16073: -- It is an interesting topic. In

[jira] [Comment Edited] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735580#comment-15735580 ] Kazuaki Ishizaki edited comment on SPARK-18745 at 12/9/16 3:2

[jira] [Comment Edited] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735580#comment-15735580 ] Kazuaki Ishizaki edited comment on SPARK-18745 at 12/9/16 3:2

[jira] [Commented] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-09 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735580#comment-15735580 ] Kazuaki Ishizaki commented on SPARK-18745: -- I identified a root cause of

Re: Reduce memory usage of UnsafeInMemorySorter

2016-12-08 Thread Kazuaki Ishizaki
The line that I pointed out would work correctly. This is because the type of this division is double. d2i correctly handles overflow cases. Kazuaki Ishizaki From: Nicholas Chammas To: Kazuaki Ishizaki/Japan/IBM@IBMJP, Reynold Xin Cc: Spark dev list Date: 2016/12/08 10:56
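The remark about d2i in the snippet above can be illustrated with a minimal Java sketch. It is not the UnsafeInMemorySorter code: the class name D2iDemo and the methods viaDouble and viaLong are hypothetical, chosen only to contrast a double-to-int cast (compiled to d2i, which saturates) with a long-to-int cast (l2i, which keeps only the low 32 bits and can wrap).

```java
// Illustration only: hypothetical names, not Spark's UnsafeInMemorySorter code.
final class D2iDemo {
    // long / double is evaluated as a double division. The cast to int
    // compiles to the JVM d2i instruction, which saturates: NaN becomes 0,
    // values above Integer.MAX_VALUE become MAX_VALUE, and values below
    // Integer.MIN_VALUE become MIN_VALUE.
    static int viaDouble(long capacity, double divisor) {
        return (int) (capacity / divisor);
    }

    // long / long stays a long. The cast to int compiles to l2i, which
    // keeps only the low 32 bits, so a large quotient silently wraps.
    static int viaLong(long capacity, long divisor) {
        return (int) (capacity / divisor);
    }

    public static void main(String[] args) {
        long big = 1L << 40;
        System.out.println(viaDouble(big, 1.5)); // clamps to Integer.MAX_VALUE
        System.out.println(viaLong(big, 1L));    // low 32 bits of 2^40 are 0
    }
}
```

Under these assumptions, a size computed through a double division cannot turn into a negative or wrapped int; it is clamped to the int range instead.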

Re: Reduce memory usage of UnsafeInMemorySorter

2016-12-07 Thread Kazuaki Ishizaki
org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L156 Regards, Kazuaki Ishizaki From: Reynold Xin To: Nicholas Chammas Cc: Spark dev list Date: 2016/12/07 14:27 Subject: Re: Reduce memory usage of UnsafeInMemorySorter This is not supposed to happen. Do

[jira] [Updated] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-18745: - Affects Version/s: 2.2.0 > java.lang.IndexOutOfBoundsException running query 68 Sp

[jira] [Commented] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15726486#comment-15726486 ] Kazuaki Ishizaki commented on SPARK-18745: -- I work with [~jfc...@us.ibm

[jira] [Created] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-18653: Summary: Dataset.show() generates incorrect padding for Unicode Character Key: SPARK-18653 URL: https://issues.apache.org/jira/browse/SPARK-18653 Project

[jira] [Commented] (SPARK-17680) Unicode Character Support for Column Names and Comments

2016-11-29 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707629#comment-15707629 ] Kazuaki Ishizaki commented on SPARK-17680: -- Sorry, it is my mistake. >

[jira] [Commented] (SPARK-18502) Spark does not handle columns that contain backquote (`)

2016-11-29 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706385#comment-15706385 ] Kazuaki Ishizaki commented on SPARK-18502: -- I can reproduce this excep

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2016-11-28 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702760#comment-15702760 ] Kazuaki Ishizaki commented on SPARK-18492: -- I realized that the following

[jira] [Resolved] (SPARK-15950) Eliminate unreachable code at projection for complex types

2016-11-27 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki resolved SPARK-15950. -- Resolution: Duplicate > Eliminate unreachable code at projection for complex ty

Re: Linear regression + Janino Exception

2016-11-21 Thread Kazuaki Ishizaki
Thank you for reporting the error. I think that this is related to https://issues.apache.org/jira/browse/SPARK-18492 The reporter of this JIRA entry has not posted the program yet. Would it be possible to add your program that can reproduce this issue to this JIRA entry? Regards, Kazuaki

[jira] [Comment Edited] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2016-11-17 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674599#comment-15674599 ] Kazuaki Ishizaki edited comment on SPARK-18492 at 11/17/16 7:3

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2016-11-17 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674599#comment-15674599 ] Kazuaki Ishizaki commented on SPARK-18492: -- Can you post a small program

[jira] [Commented] (SPARK-18458) core dumped running Spark SQL on large data volume (100TB)

2016-11-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671237#comment-15671237 ] Kazuaki Ishizaki commented on SPARK-18458: -- I worked with [~jfc...@us.ibm

[jira] [Issue Comment Deleted] (SPARK-18458) core dumped running Spark SQL on large data volume (100TB)

2016-11-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-18458: - Comment: was deleted (was: I worked with [~jfc...@us.ibm.com]. Then, I identified that a

[jira] [Commented] (SPARK-18458) core dumped running Spark SQL on large data volume (100TB)

2016-11-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671229#comment-15671229 ] Kazuaki Ishizaki commented on SPARK-18458: -- I worked with [~jfc...@us.ibm

[jira] [Commented] (SPARK-18458) core dumped running Spark SQL on large data volume (100TB)

2016-11-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670828#comment-15670828 ] Kazuaki Ishizaki commented on SPARK-18458: -- I see. I will do that. >

[jira] [Commented] (SPARK-18458) core dumped running Spark SQL on large data volume (100TB)

2016-11-15 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668948#comment-15668948 ] Kazuaki Ishizaki commented on SPARK-18458: -- I will work on this. > core

[jira] [Updated] (SPARK-18284) Scheme of DataFrame generated from RDD is diffrent between master and 2.0

2016-11-04 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-18284: - Description: When the following program is executed, a schema of dataframe is different

[jira] [Updated] (SPARK-18284) Scheme of DataFrame generated from RDD is diffrent between master and 2.0

2016-11-04 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-18284: - Affects Version/s: 2.1.0 > Scheme of DataFrame generated from RDD is diffrent betw

[jira] [Created] (SPARK-18284) Scheme of DataFrame generated from RDD is diffrent between master and 2.0

2016-11-04 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-18284: Summary: Scheme of DataFrame generated from RDD is diffrent between master and 2.0 Key: SPARK-18284 URL: https://issues.apache.org/jira/browse/SPARK-18284

[jira] [Commented] (SPARK-18207) class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2016-11-02 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629522#comment-15629522 ] Kazuaki Ishizaki commented on SPARK-18207: -- I created a smaller progra

[jira] [Commented] (SPARK-18125) Spark generated code causes CompileException when groupByKey, reduceGroups and map(_._2) are used

2016-10-28 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615034#comment-15615034 ] Kazuaki Ishizaki commented on SPARK-18125: -- I confirmed this code can repro

[jira] [Commented] (SPARK-18147) Broken Spark SQL Codegen

2016-10-28 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614959#comment-15614959 ] Kazuaki Ishizaki commented on SPARK-18147: -- This also cause the same excep

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-27 Thread Kazuaki Ishizaki
Hi Chin Wei, Thank you for confirming this on 2.0.1; I am happy to hear that it never happens. The performance will be improved when this PR (https://github.com/apache/spark/pull/15219) is integrated. Regards, Kazuaki Ishizaki From: Chin Wei Low To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc

Re: [Spark 2.0.1] Error in generated code, possible regression?

2016-10-24 Thread Kazuaki Ishizaki
Could you provide a smaller program that reproduces the same error? It would be great if you could also create a JIRA entry. Kazuaki Ishizaki From: Efe Selcuk To: "user @spark" Date: 2016/10/25 10:23 Subject: [Spark 2.0.1] Error in generated code, possible regression?

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-24 Thread Kazuaki Ishizaki
Hi Chin Wei, I am sorry for being late to reply. Got it. Interesting behavior. How did you measure the time between 1st and 2nd events? Best Regards, Kazuaki Ishizaki From: Chin Wei Low To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc: user@spark.apache.org Date: 2016/10/10 11:33 Subject

[jira] [Commented] (SPARK-15687) Columnar execution engine

2016-10-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588641#comment-15588641 ] Kazuaki Ishizaki commented on SPARK-15687: -- [#15219|https://github.com/ap

[jira] [Created] (SPARK-17915) Prepare ColumnVector implementation for UnsafeData

2016-10-13 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-17915: Summary: Prepare ColumnVector implementation for UnsafeData Key: SPARK-17915 URL: https://issues.apache.org/jira/browse/SPARK-17915 Project: Spark

[jira] [Created] (SPARK-17912) Refactor code generation to get data for ColumnVector/ColumnarBatch

2016-10-13 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-17912: Summary: Refactor code generation to get data for ColumnVector/ColumnarBatch Key: SPARK-17912 URL: https://issues.apache.org/jira/browse/SPARK-17912 Project

[jira] [Created] (SPARK-17905) Added test cases for InMemoryRelation

2016-10-13 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-17905: Summary: Added test cases for InMemoryRelation Key: SPARK-17905 URL: https://issues.apache.org/jira/browse/SPARK-17905 Project: Spark Issue Type

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569493#comment-15569493 ] Kazuaki Ishizaki commented on SPARK-16845: -- Thank you for preparing the cas

[jira] [Issue Comment Deleted] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-16845: - Comment: was deleted (was: Thank you for preparing the case. I noticed that the

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569489#comment-15569489 ] Kazuaki Ishizaki commented on SPARK-16845: -- Thank you for preparing the cas

[jira] [Resolved] (SPARK-16223) Codegen failure with a Dataframe program using an array

2016-10-08 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki resolved SPARK-16223. -- Resolution: Fixed > Codegen failure with a Dataframe program using an ar

[jira] [Commented] (SPARK-16223) Codegen failure with a Dataframe program using an array

2016-10-08 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557853#comment-15557853 ] Kazuaki Ishizaki commented on SPARK-16223: -- When I rerun it with {{co

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-07 Thread Kazuaki Ishizaki
you use to get data, cache or parquet? val res = sqlContext.sql("table1 union table2 union table3") res.explain(true) res.collect() Am I misunderstanding something? Best Regards, Kazuaki Ishizaki From: Chin Wei Low To: Kazuaki Ishizaki/Japan/IBM@IBMJP Cc: user@spark.apach

Re: Spark SQL is slower when DataFrame is cache in Memory

2016-10-07 Thread Kazuaki Ishizaki
/spark/pull/15219) is ready for review. It would achieve a 1.2x performance improvement for a compressed column and a substantial performance improvement for an uncompressed column. Best Regards, Kazuaki Ishizaki From: Chin Wei Low To: user@spark.apache.org Date: 2016/10/07 13:05 Subject
