[jira] [Assigned] (SPARK-7176) Add validation functionality to individual Param

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7176: --- Assignee: Joseph K. Bradley (was: Apache Spark) > Add validation functionality to individual

[jira] [Commented] (SPARK-7176) Add validation functionality to individual Param

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516544#comment-14516544 ] Apache Spark commented on SPARK-7176: - User 'jkbradley' has created a pull request for

[jira] [Assigned] (SPARK-7176) Add validation functionality to individual Param

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7176: --- Assignee: Apache Spark (was: Joseph K. Bradley) > Add validation functionality to individual

[jira] [Resolved] (SPARK-5946) Add Python API for Kafka direct stream

2015-04-27 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-5946. -- Resolution: Fixed Fix Version/s: 1.4.0 Assignee: Saisai Shao > Add Python API fo

[jira] [Commented] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

2015-04-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516536#comment-14516536 ] Patrick Wendell commented on SPARK-7180: /cc [~rxin] > SerializationDebugger fail

[jira] [Assigned] (SPARK-7181) External Sorter merge with aggregation go to an infinite loop when we have a total ordering

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7181: --- Assignee: Apache Spark > External Sorter merge with aggregation go to an infinite loop when w

[jira] [Commented] (SPARK-7181) External Sorter merge with aggregation go to an infinite loop when we have a total ordering

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516489#comment-14516489 ] Apache Spark commented on SPARK-7181: - User 'chouqin' has created a pull request for t

[jira] [Assigned] (SPARK-7181) External Sorter merge with aggregation go to an infinite loop when we have a total ordering

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7181: --- Assignee: (was: Apache Spark) > External Sorter merge with aggregation go to an infinite

[jira] [Updated] (SPARK-7181) External Sorter merge with aggregation go to an infinite loop when we have a total ordering

2015-04-27 Thread Qiping Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qiping Li updated SPARK-7181: - Summary: External Sorter merge with aggregation go to an infinite loop when we have a total ordering (was

[jira] [Updated] (SPARK-7181) External Sorter merge with aggregation go to an infinite loop when we have a total ordering

2015-04-27 Thread Qiping Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qiping Li updated SPARK-7181: - Description: In the function {{mergeWithAggregation}} of {{ExternalSorter.scala}}, when there is a total

[jira] [Created] (SPARK-7188) Support math functions in DataFrames in Python

2015-04-27 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7188: -- Summary: Support math functions in DataFrames in Python Key: SPARK-7188 URL: https://issues.apache.org/jira/browse/SPARK-7188 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-6829) Support math functions in DataFrames

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-6829. Resolution: Fixed Fix Version/s: 1.4.0 > Support math functions in DataFrames > -

[jira] [Commented] (SPARK-6314) Failed to load application log data from FileStatus

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516452#comment-14516452 ] Apache Spark commented on SPARK-6314: - User 'liyezhang556520' has created a pull reque

[jira] [Updated] (SPARK-7179) Add pattern after "show tables" to filter desire tablename

2015-04-27 Thread baishuo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] baishuo updated SPARK-7179: --- Priority: Minor (was: Major) > Add pattern after "show tables" to filter desire tablename > -

[jira] [Commented] (SPARK-6314) Failed to load application log data from FileStatus

2015-04-27 Thread Zhang, Liye (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516436#comment-14516436 ] Zhang, Liye commented on SPARK-6314: Hi [~srowen], this issue may be a duplicate of [

[jira] [Assigned] (SPARK-7187) Exceptions in SerializationDebugger should not crash user code

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7187: --- Assignee: Apache Spark (was: Andrew Or) > Exceptions in SerializationDebugger should not cra

[jira] [Updated] (SPARK-7187) Exceptions in SerializationDebugger should not crash user code

2015-04-27 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-7187: - Description: When issues like SPARK-7180 occur, it ends up crashing user code through the ClosureCleaner

[jira] [Commented] (SPARK-7187) Exceptions in SerializationDebugger should not crash user code

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516431#comment-14516431 ] Apache Spark commented on SPARK-7187: - User 'andrewor14' has created a pull request fo

[jira] [Assigned] (SPARK-7187) Exceptions in SerializationDebugger should not crash user code

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7187: --- Assignee: Andrew Or (was: Apache Spark) > Exceptions in SerializationDebugger should not cra

[jira] [Commented] (SPARK-5189) Reorganize EC2 scripts so that nodes can be provisioned independent of Spark master

2015-04-27 Thread pengyunli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516427#comment-14516427 ] pengyunli commented on SPARK-5189: -- I want to work on this issue; please assign it to me > Re

[jira] [Updated] (SPARK-7181) External Sorter merge with aggregation go to an infinity loop when we have a total ordering

2015-04-27 Thread Qiping Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qiping Li updated SPARK-7181: - Summary: External Sorter merge with aggregation go to an infinity loop when we have a total ordering (was

[jira] [Updated] (SPARK-7181) External Sorter merge with aggregation doesn't aggregate all the values for the same key when we have a total ordering

2015-04-27 Thread Qiping Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qiping Li updated SPARK-7181: - Description: In the function {{mergeWithAggregation}} of {{ExternalSorter.scala}}, when there is a total

[jira] [Created] (SPARK-7187) Exceptions in SerializationDebugger should not crash user code

2015-04-27 Thread Andrew Or (JIRA)
Andrew Or created SPARK-7187: Summary: Exceptions in SerializationDebugger should not crash user code Key: SPARK-7187 URL: https://issues.apache.org/jira/browse/SPARK-7187 Project: Spark Issue T

[jira] [Commented] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-27 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516410#comment-14516410 ] Cheng Hao commented on SPARK-6923: -- Sorry, after investigating, it is probably not a bug of

[jira] [Updated] (SPARK-7075) Project Tungsten: Improving Physical Execution and Memory Management

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7075: --- Summary: Project Tungsten: Improving Physical Execution and Memory Management (was: Improving Physica

[jira] [Updated] (SPARK-7079) Cache-aware external sort

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7079: --- Summary: Cache-aware external sort (was: Cache-friendly external sort) > Cache-aware external sort >

[jira] [Updated] (SPARK-7082) Binary processing sort-merge join

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7082: --- Summary: Binary processing sort-merge join (was: compact tuple based sort-merge join) > Binary proce

[jira] [Updated] (SPARK-7081) Faster sort-based shuffle path using binary processing cache-aware sort

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7081: --- Summary: Faster sort-based shuffle path using binary processing cache-aware sort (was: Faster sort-ba

[jira] [Updated] (SPARK-7083) Binary processing dimensional join

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7083: --- Summary: Binary processing dimensional join (was: compact tuple based dimensional join) > Binary pro

[jira] [Updated] (SPARK-7080) Binary processing based aggregate operator

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7080: --- Summary: Binary processing based aggregate operator (was: compact tuple based aggregate operator) >

[jira] [Updated] (SPARK-7076) Binary processing compact tuple representation

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7076: --- Summary: Binary processing compact tuple representation (was: Compact tuple representation) > Binary

[jira] [Updated] (SPARK-7077) Binary processing hash table for aggregation

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7077: --- Summary: Binary processing hash table for aggregation (was: compact tuple based hash table for aggreg

[jira] [Updated] (SPARK-7078) Cache-aware binary processing in-memory sort

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7078: --- Summary: Cache-aware binary processing in-memory sort (was: Cache-friendly in-memory sort on compact

[jira] [Updated] (SPARK-7077) Binary processing hash table for aggregation

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7077: --- Description: Let's start with a hash table for aggregations. (was: Let's start with a compact hash ta

[jira] [Commented] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516396#comment-14516396 ] Apache Spark commented on SPARK-6923: - User 'chenghao-intel' has created a pull reques

[jira] [Created] (SPARK-7186) Decouple internal Row from external Row

2015-04-27 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7186: -- Summary: Decouple internal Row from external Row Key: SPARK-7186 URL: https://issues.apache.org/jira/browse/SPARK-7186 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-7075) Improving Physical Execution and Memory Management

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7075: --- Epic Name: Project Tungsten (was: Project Iron) > Improving Physical Execution and Memory Management

[jira] [Created] (SPARK-7185) Python API for math functions in DataFrames

2015-04-27 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7185: -- Summary: Python API for math functions in DataFrames Key: SPARK-7185 URL: https://issues.apache.org/jira/browse/SPARK-7185 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-7184) Investigate turning codegen on by default

2015-04-27 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7184: -- Summary: Investigate turning codegen on by default Key: SPARK-7184 URL: https://issues.apache.org/jira/browse/SPARK-7184 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-7075) Improving Physical Execution and Memory Management

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7075: --- Description: Based on our observation, the majority of Spark workloads are not bottlenecked by I/O or net

[jira] [Updated] (SPARK-7075) Improving Physical Execution and Memory Management

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7075: --- Description: Based on our observation, the majority of Spark workloads are not bottlenecked by I/O or net

[jira] [Resolved] (SPARK-7174) Move calling `TaskScheduler.executorHeartbeatReceived` to another thread to avoid blocking the Akka thread pool

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7174. Resolution: Fixed Fix Version/s: 1.4.0 Assignee: Shixiong Zhu > Move calling `TaskSc

[jira] [Updated] (SPARK-7174) Move calling `TaskScheduler.executorHeartbeatReceived` to another thread to avoid blocking the Akka thread pool

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7174: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-5293 > Move calling `TaskScheduler.executorHear

[jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

2015-04-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7180: --- Assignee: (was: Reynold Xin) > SerializationDebugger fails with ArrayOutOfBoundsException > --

[jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

2015-04-27 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-7180: - Assignee: Reynold Xin > SerializationDebugger fails with ArrayOutOfBoundsException > -

[jira] [Updated] (SPARK-7182) [SQL] Can't remove columns from DataFrame or save DataFrame from a join due to duplicate columns

2015-04-27 Thread Don Drake (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Don Drake updated SPARK-7182: - Summary: [SQL] Can't remove columns from DataFrame or save DataFrame from a join due to duplicate columns

[jira] [Updated] (SPARK-7183) Memory leak in netty shuffle with spark standalone cluster

2015-04-27 Thread Jack Hu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Hu updated SPARK-7183: --- Summary: Memory leak in netty shuffle with spark standalone cluster (was: Memory leak with netty shuffle with

[jira] [Updated] (SPARK-7182) [SQL] Can't remove or save DataFrame from a join due to duplicate columns

2015-04-27 Thread Don Drake (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Don Drake updated SPARK-7182: - Description: I'm having trouble saving a dataframe as parquet after performing a simple table join. Belo

[jira] [Created] (SPARK-7183) Memory leak with netty shuffle with spark standalone cluster

2015-04-27 Thread Jack Hu (JIRA)
Jack Hu created SPARK-7183: -- Summary: Memory leak with netty shuffle with spark standalone cluster Key: SPARK-7183 URL: https://issues.apache.org/jira/browse/SPARK-7183 Project: Spark Issue Type: B

[jira] [Created] (SPARK-7182) [SQL] Can't remove or save DataFrame from a join due to duplicate columns

2015-04-27 Thread Don Drake (JIRA)
Don Drake created SPARK-7182: Summary: [SQL] Can't remove or save DataFrame from a join due to duplicate columns Key: SPARK-7182 URL: https://issues.apache.org/jira/browse/SPARK-7182 Project: Spark

[jira] [Updated] (SPARK-6921) Spark SQL API "saveAsParquetFile" will output tachyon file with different block size

2015-04-27 Thread zhangxiongfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangxiongfei updated SPARK-6921: - Affects Version/s: 1.3.1 > Spark SQL API "saveAsParquetFile" will output tachyon file with differe

[jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

2015-04-27 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-7180: - Description: Simple reproduction: {code} class Parent extends Serializable { val a = "a" val b = "b" }

[jira] [Commented] (SPARK-6921) Spark SQL API "saveAsParquetFile" will output tachyon file with different block size

2015-04-27 Thread zhangxiongfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516299#comment-14516299 ] zhangxiongfei commented on SPARK-6921: -- I think the reason for this issue is below: 1

[jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

2015-04-27 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-7180: - Description: Simple reproduction: {code} class Parent extends Serializable { val a = "a" val b = "b" }

[jira] [Created] (SPARK-7181) External Sorter merge with aggregation doesn't aggregate all the values for the same key when we have a total ordering

2015-04-27 Thread Qiping Li (JIRA)
Qiping Li created SPARK-7181: Summary: External Sorter merge with aggregation doesn't aggregate all the values for the same key when we have a total ordering Key: SPARK-7181 URL: https://issues.apache.org/jira/browse/

[jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

2015-04-27 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-7180: - Summary: SerializationDebugger fails with ArrayOutOfBoundsException (was: SerializationDebugger fails whe

[jira] [Commented] (SPARK-6530) ChiSqSelector transformer

2015-04-27 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516286#comment-14516286 ] Xusen Yin commented on SPARK-6530: -- I have started working on it. > ChiSqSelector transformer >

[jira] [Commented] (SPARK-7139) Allow received block metadata to be saved to WAL and recovered on driver failure

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516279#comment-14516279 ] Apache Spark commented on SPARK-7139: - User 'tdas' has created a pull request for this

[jira] [Updated] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-27 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-6923: Priority: Blocker (was: Major) Target Version/s: 1.4.0 > Spark SQL CLI does not read Data Sourc

[jira] [Updated] (SPARK-7180) SerializationDebugger fails when attempting to serialize a FunSuite

2015-04-27 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-7180: - Description: This is most likely not specific to FunSuite, but when I try to serialize one (don't ask w

[jira] [Created] (SPARK-7180) SerializationDebugger fails when attempting to serialize a FunSuite

2015-04-27 Thread Andrew Or (JIRA)
Andrew Or created SPARK-7180: Summary: SerializationDebugger fails when attempting to serialize a FunSuite Key: SPARK-7180 URL: https://issues.apache.org/jira/browse/SPARK-7180 Project: Spark Is

[jira] [Commented] (SPARK-7179) Add pattern after "show tables" to filter desire tablename

2015-04-27 Thread baishuo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516273#comment-14516273 ] baishuo commented on SPARK-7179: The semantics of "show tables" in Hive are like "show table

[jira] [Created] (SPARK-7179) Add pattern after "show tables" to filter desire tablename

2015-04-27 Thread baishuo (JIRA)
baishuo created SPARK-7179: -- Summary: Add pattern after "show tables" to filter desire tablename Key: SPARK-7179 URL: https://issues.apache.org/jira/browse/SPARK-7179 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-27 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-6923: Description: {code:java} HiveContext hctx = new HiveContext(sc); List sample = new ArrayList(); sample.add(

[jira] [Commented] (SPARK-7154) Spark distro appears to be pulling in incorrect protobuf classes

2015-04-27 Thread Dmitry Goldenberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516242#comment-14516242 ] Dmitry Goldenberg commented on SPARK-7154: -- Could this possibly be documented in

[jira] [Commented] (SPARK-7154) Spark distro appears to be pulling in incorrect protobuf classes

2015-04-27 Thread Dmitry Goldenberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516241#comment-14516241 ] Dmitry Goldenberg commented on SPARK-7154: -- I've added this info to the 'mirror'

[jira] [Assigned] (SPARK-5895) Add VectorSlicer

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5895: --- Assignee: Xusen Yin (was: Apache Spark) > Add VectorSlicer > > >

[jira] [Assigned] (SPARK-5895) Add VectorSlicer

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5895: --- Assignee: Apache Spark (was: Xusen Yin) > Add VectorSlicer > > >

[jira] [Commented] (SPARK-5895) Add VectorSlicer

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516240#comment-14516240 ] Apache Spark commented on SPARK-5895: - User 'yinxusen' has created a pull request for

[jira] [Commented] (SPARK-6923) Spark SQL CLI does not read Data Source schema correctly

2015-04-27 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516226#comment-14516226 ] Cheng Hao commented on SPARK-6923: -- [~pin_zhang], I agree with [~marmbrus], you're hitting a b

[jira] [Commented] (SPARK-5456) Decimal Type comparison issue

2015-04-27 Thread Yi Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516195#comment-14516195 ] Yi Zhang commented on SPARK-5456: - I got the same issue. When I access PostgreSQL based on Spa

[jira] [Updated] (SPARK-7090) Introduce LDAOptimizer to LDA to further improve extensibility

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7090: - Target Version/s: 1.4.0 > Introduce LDAOptimizer to LDA to further improve extensibility >

[jira] [Updated] (SPARK-7090) Introduce LDAOptimizer to LDA to further improve extensibility

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7090: - Assignee: yuhao yang > Introduce LDAOptimizer to LDA to further improve extensibility > --

[jira] [Resolved] (SPARK-7090) Introduce LDAOptimizer to LDA to further improve extensibility

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-7090. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5661 [https

[jira] [Closed] (SPARK-5870) GradientBoostedTrees should cache residuals from partial model

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-5870. Resolution: Duplicate Fix Version/s: 1.4.0 > GradientBoostedTrees should cache residu

[jira] [Updated] (SPARK-5870) GradientBoostedTrees should cache residuals from partial model

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5870: - Fix Version/s: (was: 1.4.0) > GradientBoostedTrees should cache residuals from partial

[jira] [Resolved] (SPARK-6253) Add LassoModel to __all__ in regression.py

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-6253. -- Resolution: Fixed Fix Version/s: 1.3.1 1.4.0 > Add LassoModel

[jira] [Assigned] (SPARK-6253) Add LassoModel to __all__ in regression.py

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6253: --- Assignee: Joseph K. Bradley (was: Apache Spark) > Add LassoModel to __all__ in regression.py

[jira] [Commented] (SPARK-6253) Add LassoModel to __all__ in regression.py

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516169#comment-14516169 ] Apache Spark commented on SPARK-6253: - User 'jkbradley' has created a pull request for

[jira] [Assigned] (SPARK-6253) Add LassoModel to __all__ in regression.py

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6253: --- Assignee: Apache Spark (was: Joseph K. Bradley) > Add LassoModel to __all__ in regression.py

[jira] [Commented] (SPARK-7065) Clear the cached locations mapping after every stage to avoid inconsistent status

2015-04-27 Thread Susu Xie (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516162#comment-14516162 ] Susu Xie commented on SPARK-7065: - We do this because we found in some cases (we've not ide

[jira] [Commented] (SPARK-5100) Spark Thrift server monitor page

2015-04-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516155#comment-14516155 ] Apache Spark commented on SPARK-5100: - User 'tianyi' has created a pull request for th

[jira] [Updated] (SPARK-6390) Add MatrixUDT in PySpark

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6390: - Target Version/s: 1.5.0 (was: 1.4.0) > Add MatrixUDT in PySpark > ---

[jira] [Reopened] (SPARK-5726) Hadamard Vector Product Transformer

2015-04-27 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-5726: -- > Hadamard Vector Product Transformer > --- > > Key:

[jira] [Updated] (SPARK-5726) Hadamard Vector Product Transformer

2015-04-27 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5726: - Shepherd: Xiangrui Meng > Hadamard Vector Product Transformer > --

[jira] [Commented] (SPARK-7064) Adding binary sparse vector support

2015-04-27 Thread Susu Xie (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516141#comment-14516141 ] Susu Xie commented on SPARK-7064: - Here is some context. While working with Spark, we

[jira] [Assigned] (SPARK-6948) VectorAssembler should choose dense/sparse for output based on number of zeros

2015-04-27 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-6948: Assignee: Xiangrui Meng > VectorAssembler should choose dense/sparse for output based on nu

[jira] [Updated] (SPARK-6160) ChiSqSelector should keep test statistic info

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6160: - Target Version/s: 1.5.0 (was: 1.4.0) > ChiSqSelector should keep test statistic info > --

[jira] [Updated] (SPARK-4980) Add decay factors to streaming linear methods

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4980: - Target Version/s: 1.5.0 (was: 1.4.0) > Add decay factors to streaming linear methods > --

[jira] [Updated] (SPARK-4285) Transpose RDD[Vector] to column store for ML

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4285: - Target Version/s: 1.5.0 (was: 1.4.0) > Transpose RDD[Vector] to column store for ML > ---

[jira] [Updated] (SPARK-5562) LDA should handle empty documents

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5562: - Target Version/s: 1.5.0 (was: 1.4.0) > LDA should handle empty documents > --

[jira] [Updated] (SPARK-6164) CrossValidatorModel should keep stats from fitting

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6164: - Target Version/s: 1.5.0 (was: 1.4.0) > CrossValidatorModel should keep stats from fitting

[jira] [Updated] (SPARK-6684) Add checkpointing to GradientBoostedTrees

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6684: - Target Version/s: 1.5.0 (was: 1.4.0) > Add checkpointing to GradientBoostedTrees > --

[jira] [Updated] (SPARK-6312) ChiSqTest should check for too few counts

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6312: - Target Version/s: 1.5.0 (was: 1.4.0) > ChiSqTest should check for too few counts > --

[jira] [Assigned] (SPARK-6685) Use DSYRK to compute AtA in ALS

2015-04-27 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-6685: Assignee: Xiangrui Meng > Use DSYRK to compute AtA in ALS > ---

[jira] [Updated] (SPARK-6295) spark.ml.Evaluator should have evaluate method not taking ParamMap

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6295: - Assignee: Joseph K. Bradley > spark.ml.Evaluator should have evaluate method not taking Pa

[jira] [Updated] (SPARK-5283) ML sharedParams should be private

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5283: - Assignee: Joseph K. Bradley (was: Xiangrui Meng) > ML sharedParams should be private > --

[jira] [Updated] (SPARK-2335) k-Nearest Neighbor classification and regression for MLLib

2015-04-27 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2335: - Target Version/s: (was: 1.4.0) > k-Nearest Neighbor classification and regression for MLLib > --

[jira] [Updated] (SPARK-2336) Approximate k-NN Models for MLLib

2015-04-27 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2336: - Target Version/s: (was: 1.4.0) > Approximate k-NN Models for MLLib > ---

[jira] [Updated] (SPARK-5893) Add Bucketizer

2015-04-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5893: - Assignee: Joseph K. Bradley > Add Bucketizer > -- > > Key: SPA
