[jira] [Resolved] (SPARK-4315) PySpark pickling of pyspark.sql.Row objects is extremely inefficient

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-4315. --- Resolution: Fixed Fix Version/s: 1.4.0 1.3.2 PySpark pickling of

[jira] [Commented] (SPARK-5092) Selecting from a nested structure with SparkSQL should return a nested structure

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619632#comment-14619632 ] Davies Liu commented on SPARK-5092: --- cc [~marmbrus] Selecting from a nested structure

[jira] [Updated] (SPARK-8931) Fallback to interpret mode if failed to compile in codegen

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-8931: -- Description: And we should not fallback during testing. Fallback to interpret mode if failed to

[jira] [Closed] (SPARK-7507) pyspark.sql.types.StructType and Row should implement __iter__()

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-7507. - Resolution: Won't Fix pyspark.sql.types.StructType and Row should implement __iter__()

[jira] [Commented] (SPARK-7507) pyspark.sql.types.StructType and Row should implement __iter__()

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619609#comment-14619609 ] Davies Liu commented on SPARK-7507: --- For `Row`, it's similar to namedtuple, you can

[jira] [Resolved] (SPARK-8450) PySpark write.parquet raises Unsupported datatype DecimalType()

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-8450. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7131

[jira] [Commented] (SPARK-8408) Python OR operator is not considered while creating a column of boolean type

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619622#comment-14619622 ] Davies Liu commented on SPARK-8408: --- In Python, we cannot override `or`, `and`, `not`, so

[jira] [Resolved] (SPARK-8408) Python OR operator is not considered while creating a column of boolean type

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-8408. --- Resolution: Fixed Assignee: Davies Liu Fix Version/s: 1.4.1 Python OR operator is

[jira] [Created] (SPARK-8931) Fallback to interpret mode if failed to compile in codegen

2015-07-08 Thread Davies Liu (JIRA)
Davies Liu created SPARK-8931: - Summary: Fallback to interpret mode if failed to compile in codegen Key: SPARK-8931 URL: https://issues.apache.org/jira/browse/SPARK-8931 Project: Spark Issue

[jira] [Resolved] (SPARK-7190) UTF8String backed by binary data

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7190. --- Resolution: Fixed UTF8String backed by binary data

[jira] [Resolved] (SPARK-7815) Enable UTF8String to work against memory address directly

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7815. --- Resolution: Fixed Enable UTF8String to work against memory address directly

[jira] [Assigned] (SPARK-6573) Convert inbound NaN values as null

2015-07-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-6573: - Assignee: Davies Liu Convert inbound NaN values as null --

[jira] [Resolved] (SPARK-7909) spark-ec2 and associated tools not py3 ready

2015-07-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7909. --- Resolution: Fixed spark-ec2 and associated tools not py3 ready

[jira] [Resolved] (SPARK-6289) PySpark doesn't maintain SQL date Types

2015-07-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-6289. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7301

[jira] [Resolved] (SPARK-7902) SQL UDF doesn't support UDT in PySpark

2015-07-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-7902. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7301

[jira] [Resolved] (SPARK-11804) Exception raise when using Jdbc predicates option in PySpark

2015-11-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-11804. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9791

[jira] [Resolved] (SPARK-11767) Easy to OOM when cache large column

2015-11-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-11767. Resolution: Fixed Fix Version/s: 1.6.0 > Easy to OOM when cache large column >

[jira] [Resolved] (SPARK-11016) Spark fails when running with a task that requires a more recent version of RoaringBitmaps

2015-11-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-11016. Resolution: Fixed Issue resolved by pull request 9748 [https://github.com/apache/spark/pull/9748]

[jira] [Resolved] (SPARK-11583) Make MapStatus use less memory usage

2015-11-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-11583. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9746

[jira] [Created] (SPARK-11805) SpillableIterator should free the in-memory sorter while spilling

2015-11-17 Thread Davies Liu (JIRA)
Davies Liu created SPARK-11805: -- Summary: SpillableIterator should free the in-memory sorter while spilling Key: SPARK-11805 URL: https://issues.apache.org/jira/browse/SPARK-11805 Project: Spark

[jira] [Resolved] (SPARK-11737) String may not be serialized correctly with Kryo

2015-11-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-11737. Resolution: Fixed Fix Version/s: 1.5.2 1.6.0 Issue resolved by pull

[jira] [Updated] (SPARK-11737) String may not be serialized correctly with Kryo

2015-11-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-11737: --- Fix Version/s: (was: 1.5.2) 1.5.3 > String may not be serialized correctly

[jira] [Commented] (SPARK-9228) Combine unsafe and codegen into a single option

2015-08-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715512#comment-14715512 ] Davies Liu commented on SPARK-9228: --- [~jameszhouyi] unsafe.offHeap is another option

[jira] [Closed] (SPARK-10302) NPE while save a DataFrame as ORC

2015-08-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-10302. -- Resolution: Duplicate Fix Version/s: 1.5.0 NPE while save a DataFrame as ORC

[jira] [Created] (SPARK-10302) NPE while save a DataFrame as ORC

2015-08-26 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10302: -- Summary: NPE while save a DataFrame as ORC Key: SPARK-10302 URL: https://issues.apache.org/jira/browse/SPARK-10302 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-10305) PySpark createDataFrame on list of LabeledPoints fails (regression)

2015-08-26 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10305. Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8470

[jira] [Assigned] (SPARK-10321) OrcRelation doesn't override sizeInBytes

2015-08-27 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-10321: -- Assignee: Davies Liu OrcRelation doesn't override sizeInBytes

[jira] [Resolved] (SPARK-10196) Failed to save json data with a decimal type in the schema

2015-08-25 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10196. Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8408

[jira] [Commented] (SPARK-10215) Div of Decimal returns null

2015-08-25 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710696#comment-14710696 ] Davies Liu commented on SPARK-10215: I think we do not have enough time to figure out

[jira] [Created] (SPARK-10245) SQLContext can't parse literal less than 0.1 ( 0.01)

2015-08-25 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10245: -- Summary: SQLContext can't parse literal less than 0.1 ( 0.01) Key: SPARK-10245 URL: https://issues.apache.org/jira/browse/SPARK-10245 Project: Spark Issue Type:

[jira] [Created] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-08-26 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10309: -- Summary: Some tasks failed with Unable to acquire memory Key: SPARK-10309 URL: https://issues.apache.org/jira/browse/SPARK-10309 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-10373) Move @since annotator to pyspark to be shared by all components

2015-08-31 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724009#comment-14724009 ] Davies Liu commented on SPARK-10373: [~mengxr] Do we want to add @since for the MLLib APIs in 1.5

[jira] [Created] (SPARK-10379) UnsafeShuffleExternalSorter should preserve first page

2015-08-31 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10379: -- Summary: UnsafeShuffleExternalSorter should preserve first page Key: SPARK-10379 URL: https://issues.apache.org/jira/browse/SPARK-10379 Project: Spark Issue

[jira] [Created] (SPARK-10403) UnsafeRowSerializer can't work with UnsafeShuffleManager (tungsten-sort)

2015-09-01 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10403: -- Summary: UnsafeRowSerializer can't work with UnsafeShuffleManager (tungsten-sort) Key: SPARK-10403 URL: https://issues.apache.org/jira/browse/SPARK-10403 Project: Spark

[jira] [Updated] (SPARK-10379) UnsafeShuffleExternalSorter should preserve first page

2015-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10379: --- Target Version/s: 1.6.0, 1.5.1 (was: 1.5.0) > UnsafeShuffleExternalSorter should preserve first

[jira] [Resolved] (SPARK-10392) Pyspark - Wrong DateType support on JDBC connection

2015-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10392. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8556

[jira] [Resolved] (SPARK-10162) PySpark filters with datetimes mess up when datetimes have timezones.

2015-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10162. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8555

[jira] [Created] (SPARK-10404) Worker should terminate previous executor before launch new one

2015-09-01 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10404: -- Summary: Worker should terminate previous executor before launch new one Key: SPARK-10404 URL: https://issues.apache.org/jira/browse/SPARK-10404 Project: Spark

[jira] [Updated] (SPARK-10392) Pyspark - Wrong DateType support on JDBC connection

2015-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10392: --- Fix Version/s: 1.5.1 > Pyspark - Wrong DateType support on JDBC connection >

[jira] [Commented] (SPARK-10434) Parquet compatibility with 1.4 is broken when writing arrays that may contain nulls

2015-09-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729445#comment-14729445 ] Davies Liu commented on SPARK-10434: [~lian cheng] I think it's hard to guarantee forward

[jira] [Updated] (SPARK-10434) Parquet compatibility with 1.4 is broken when writing arrays that may contain nulls

2015-09-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10434: --- Priority: Minor (was: Critical) > Parquet compatibility with 1.4 is broken when writing arrays that

[jira] [Commented] (SPARK-10425) Add a regression test for SPARK-10379

2015-09-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729421#comment-14729421 ] Davies Liu commented on SPARK-10425: [~sowen] Thanks for your comment. The reason that PR didn't have

[jira] [Created] (SPARK-10459) PythonUDF could process UnsafeRow

2015-09-04 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10459: -- Summary: PythonUDF could process UnsafeRow Key: SPARK-10459 URL: https://issues.apache.org/jira/browse/SPARK-10459 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-8632) Poor Python UDF performance because of RDD caching

2015-09-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-8632: - Assignee: Davies Liu > Poor Python UDF performance because of RDD caching >

[jira] [Commented] (SPARK-8632) Poor Python UDF performance because of RDD caching

2015-09-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735392#comment-14735392 ] Davies Liu commented on SPARK-8632: --- [~rxin] As [~justin.uang] suggested before, the batch mode will

[jira] [Commented] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-09-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735401#comment-14735401 ] Davies Liu commented on SPARK-10309: [~nadenf] In my case, the job finally finished (after retry), so

[jira] [Commented] (SPARK-8632) Poor Python UDF performance because of RDD caching

2015-09-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735644#comment-14735644 ] Davies Liu commented on SPARK-8632: --- The upstream means child of current SparkPlan, could have other

[jira] [Created] (SPARK-10494) Multiple Python UDFs together with aggregation or sort merge join may cause OOM (failed to acquire memory)

2015-09-08 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10494: -- Summary: Multiple Python UDFs together with aggregation or sort merge join may cause OOM (failed to acquire memory) Key: SPARK-10494 URL:

[jira] [Commented] (SPARK-10466) UnsafeRow exception in Sort-Based Shuffle with data spill

2015-09-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735776#comment-14735776 ] Davies Liu commented on SPARK-10466: [~chenghao] I tried your test case and it passed in master. Is

[jira] [Commented] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-09-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735903#comment-14735903 ] Davies Liu commented on SPARK-10309: [~nadenf] Could you post the physical plan here? That could help

[jira] [Commented] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-09-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735908#comment-14735908 ] Davies Liu commented on SPARK-10309: This also could be related to

[jira] [Created] (SPARK-10424) ShuffleHashOuterJoin should consider condition

2015-09-02 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10424: -- Summary: ShuffleHashOuterJoin should consider condition Key: SPARK-10424 URL: https://issues.apache.org/jira/browse/SPARK-10424 Project: Spark Issue Type: New

[jira] [Created] (SPARK-10425) Add a regression test for SPARK-10379

2015-09-02 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10425: -- Summary: Add a regression test for SPARK-10379 Key: SPARK-10425 URL: https://issues.apache.org/jira/browse/SPARK-10425 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-10424) ShuffleHashOuterJoin should consider condition

2015-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10424: --- Priority: Blocker (was: Major) > ShuffleHashOuterJoin should consider condition >

[jira] [Resolved] (SPARK-10422) String column in InMemoryColumnarCache needs to override clone method

2015-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10422. Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8578

[jira] [Resolved] (SPARK-10417) Iterating through Column results in infinite loop

2015-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10417. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8574

[jira] [Updated] (SPARK-10436) spark-submit overwrites spark.files defaults with the job script filename

2015-09-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10436: --- Target Version/s: 1.6.0 > spark-submit overwrites spark.files defaults with the job script filename

[jira] [Commented] (SPARK-10512) Fix @since when a function doesn't have doc

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736961#comment-14736961 ] Davies Liu commented on SPARK-10512: As we discussed here

[jira] [Closed] (SPARK-10512) Fix @since when a function doesn't have doc

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-10512. -- Resolution: Won't Fix > Fix @since when a function doesn't have doc >

[jira] [Resolved] (SPARK-10065) Avoid triple copy of var-length objects in Array in tungsten projection

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10065. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8496

[jira] [Resolved] (SPARK-9730) Sort Merge Join for Full Outer Join

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-9730. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8579

[jira] [Created] (SPARK-10542) The PySpark 1.5 closure serializer can't serialize a namedtuple instance.

2015-09-10 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10542: -- Summary: The PySpark 1.5 closure serializer can't serialize a namedtuple instance. Key: SPARK-10542 URL: https://issues.apache.org/jira/browse/SPARK-10542 Project:

[jira] [Resolved] (SPARK-6931) python: struct.pack('!q', value) in write_long(value, stream) in serializers.py require int(but doesn't raise exceptions in common cases)

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-6931. --- Resolution: Fixed Fix Version/s: 1.2.3 1.3.2 Issue resolved by pull request

[jira] [Created] (SPARK-10553) Allow Ctrl-C in pyspark shell to kill running job

2015-09-10 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10553: -- Summary: Allow Ctrl-C in pyspark shell to kill running job Key: SPARK-10553 URL: https://issues.apache.org/jira/browse/SPARK-10553 Project: Spark Issue Type:

[jira] [Closed] (SPARK-10553) Allow Ctrl-C in pyspark shell to kill running job

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-10553. -- Resolution: Duplicate > Allow Ctrl-C in pyspark shell to kill running job >

[jira] [Resolved] (SPARK-6548) stddev_pop and stddev_samp aggregate functions

2015-09-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-6548. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 6297

[jira] [Assigned] (SPARK-10593) sql lateral view same name gives wrong value

2015-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-10593: -- Assignee: Davies Liu > sql lateral view same name gives wrong value >

[jira] [Resolved] (SPARK-10522) Nanoseconds part of Timestamp should be positive in parquet

2015-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10522. Resolution: Fixed Fix Version/s: 1.5.1 1.6.0 Issue resolved by pull

[jira] [Commented] (SPARK-9325) Support `collect` on DataFrame columns

2015-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744254#comment-14744254 ] Davies Liu commented on SPARK-9325: --- I would -1 on this. I'm worried that once we have

[jira] [Created] (SPARK-10572) Investigate the contentions between tasks in the same executor

2015-09-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10572: -- Summary: Investigate the contentions between tasks in the same executor Key: SPARK-10572 URL: https://issues.apache.org/jira/browse/SPARK-10572 Project: Spark

[jira] [Resolved] (SPARK-9014) Allow Python spark API to use built-in exponential operator

2015-09-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-9014. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8658

[jira] [Resolved] (SPARK-10542) The PySpark 1.5 closure serializer can't serialize a namedtuple instance.

2015-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10542. Resolution: Fixed Fix Version/s: 1.5.1 1.6.0 Target

[jira] [Resolved] (SPARK-10459) PythonUDF could process UnsafeRow

2015-09-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10459. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8616

[jira] [Resolved] (SPARK-10642) Crash in rdd.lookup() with "java.lang.Long cannot be cast to java.lang.Integer"

2015-09-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10642. Resolution: Fixed Fix Version/s: 1.2.3 1.3.2 1.4.2

[jira] [Commented] (SPARK-10685) Misaligned data with RDD.zip and DataFrame.withColumn after repartition

2015-09-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804924#comment-14804924 ] Davies Liu commented on SPARK-10685: Internally, Python UDFs use RDD.zip() and compute the upstream

[jira] [Updated] (SPARK-10685) Misaligned data with RDD.zip and DataFrame.withColumn after repartition

2015-09-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10685: --- Target Version/s: 1.5.1 Priority: Blocker (was: Major) Component/s: SQL

[jira] [Created] (SPARK-10522) Nanoseconds part of Timestamp should be positive in parquet

2015-09-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10522: -- Summary: Nanoseconds part of Timestamp should be positive in parquet Key: SPARK-10522 URL: https://issues.apache.org/jira/browse/SPARK-10522 Project: Spark

[jira] [Closed] (SPARK-10544) Serialization of Python namedtuple subclasses in functions / closures is broken

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-10544. -- Resolution: Duplicate Fix Version/s: (was: 1.5.1) Target Version/s: 1.5.1 >

[jira] [Resolved] (SPARK-10443) Refactor SortMergeOuterJoin to reduce duplication

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10443. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8596

[jira] [Resolved] (SPARK-10056) PySpark Row - Support for row["columnName"] syntax

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10056. Resolution: Fixed Assignee: Yanbo Liang Fix Version/s: 1.6.0 > PySpark Row -

[jira] [Closed] (SPARK-10397) Make Python's SparkContext self-descriptive on "print sc"

2015-09-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-10397. -- Resolution: Won't Fix > Make Python's SparkContext self-descriptive on "print sc" >

[jira] [Created] (SPARK-10593) sql lateral view same name gives wrong value

2015-09-14 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10593: -- Summary: sql lateral view same name gives wrong value Key: SPARK-10593 URL: https://issues.apache.org/jira/browse/SPARK-10593 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-10593) sql lateral view same name gives wrong value

2015-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10593: --- Description: This query will return wrong result: {code} select insideLayer1.json as

[jira] [Created] (SPARK-10859) Predicates pushed to InmemoryColumnarTableScan are not evaluated correctly

2015-09-28 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10859: -- Summary: Predicates pushed to InmemoryColumnarTableScan are not evaluated correctly Key: SPARK-10859 URL: https://issues.apache.org/jira/browse/SPARK-10859 Project:

[jira] [Resolved] (SPARK-6919) Add .asDict method to StatCounter

2015-09-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-6919. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 5516

[jira] [Resolved] (SPARK-10866) [Spark SQL] [UDF] the floor function got wrong return value type

2015-10-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10866. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8933

[jira] [Resolved] (SPARK-10865) [Spark SQL] [UDF] the ceil/ceiling function got wrong return value type

2015-10-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10865. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8933

[jira] [Commented] (SPARK-10342) Cooperative memory management

2015-10-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940524#comment-14940524 ] Davies Liu commented on SPARK-10342: This will be used internally for SQL. For example, aggregation and

[jira] [Commented] (SPARK-10903) Make sqlContext global

2015-10-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940508#comment-14940508 ] Davies Liu commented on SPARK-10903: LGTM. Another question is that can we have different SQLContext

[jira] [Resolved] (SPARK-10415) Enhance Navigation Sidebar in PySpark API

2015-09-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10415. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8571

[jira] [Resolved] (SPARK-9741) approx count distinct function

2015-09-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-9741. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8362

[jira] [Resolved] (SPARK-10395) Simplify CatalystReadSupport

2015-09-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10395. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8553

[jira] [Updated] (SPARK-10474) Aggregation failed with unable to acquire memory

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-10474: --- Target Version/s: 1.6.0, 1.5.1 Priority: Blocker (was: Critical) > Aggregation failed

[jira] [Resolved] (SPARK-10461) make sure `input.primitive` is always variable name not code at GenerateUnsafeProjection

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10461. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8613

[jira] [Commented] (SPARK-10519) Investigate if we should encode timezone information to a timestamp value stored in JSON

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737375#comment-14737375 ] Davies Liu commented on SPARK-10519: +1 for 3, users have the ability to control timezone, it's also

[jira] [Comment Edited] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737185#comment-14737185 ] Davies Liu edited comment on SPARK-10309 at 9/9/15 4:53 PM: [~nadenf] Thanks

[jira] [Commented] (SPARK-10309) Some tasks failed with Unable to acquire memory

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737185#comment-14737185 ] Davies Liu commented on SPARK-10309: [~nadenf] Thanks for letting us know, just realized that your

[jira] [Commented] (SPARK-10439) Catalyst should check for overflow / underflow of date and timestamp values

2015-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737644#comment-14737644 ] Davies Liu commented on SPARK-10439: There are many places there could be overflow, even for A + B,

[jira] [Commented] (SPARK-10685) Misaligned data with RDD.zip and DataFrame.withColumn after repartition

2015-10-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944112#comment-14944112 ] Davies Liu commented on SPARK-10685: [~jdanbrown] The zip after repartition (or shuffling) is another

[jira] [Resolved] (SPARK-10934) hashCode of unsafe array may crash

2015-10-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-10934. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8987
