[jira] [Created] (SPARK-4425) Handle NaN cast to Timestamp correctly

2014-11-15 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-4425:


 Summary: Handle NaN cast to Timestamp correctly
 Key: SPARK-4425
 URL: https://issues.apache.org/jira/browse/SPARK-4425
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Takuya Ueshin


{{Cast}} from {{NaN}} or {{Infinity}} of {{Double}} or {{Float}} to 
{{TimestampType}} throws {{NumberFormatException}}.
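
For illustration, a minimal sketch of a plausible underlying cause (an assumption, not confirmed by this report): java.math.BigDecimal cannot represent NaN or Infinity, so any double-to-timestamp conversion that goes through a decimal raises exactly this exception.

{code}
// Hedged sketch: constructing a BigDecimal from NaN (or Infinity) throws
// java.lang.NumberFormatException, matching the reported failure.
val nan: Double = Double.NaN
val decimal = new java.math.BigDecimal(nan)  // throws NumberFormatException
{code}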






[jira] [Updated] (SPARK-4425) Handle NaN or Infinity cast to Timestamp correctly

2014-11-15 Thread Takuya Ueshin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin updated SPARK-4425:
-
Summary: Handle NaN or Infinity cast to Timestamp correctly  (was: Handle 
NaN cast to Timestamp correctly)

 Handle NaN or Infinity cast to Timestamp correctly
 --

 Key: SPARK-4425
 URL: https://issues.apache.org/jira/browse/SPARK-4425
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Takuya Ueshin

 {{Cast}} from {{NaN}} or {{Infinity}} of {{Double}} or {{Float}} to 
 {{TimestampType}} throws {{NumberFormatException}}.






[jira] [Commented] (SPARK-4425) Handle NaN or Infinity cast to Timestamp correctly

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213452#comment-14213452
 ] 

Apache Spark commented on SPARK-4425:
-

User 'ueshin' has created a pull request for this issue:
https://github.com/apache/spark/pull/3283

 Handle NaN or Infinity cast to Timestamp correctly
 --

 Key: SPARK-4425
 URL: https://issues.apache.org/jira/browse/SPARK-4425
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Takuya Ueshin

 {{Cast}} from {{NaN}} or {{Infinity}} of {{Double}} or {{Float}} to 
 {{TimestampType}} throws {{NumberFormatException}}.






[jira] [Created] (SPARK-4426) The symbol of BitwiseOr is wrong, should not be '&'

2014-11-15 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-4426:
-

 Summary: The symbol of BitwiseOr is wrong, should not be '&'
 Key: SPARK-4426
 URL: https://issues.apache.org/jira/browse/SPARK-4426
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Kousuke Saruta
Priority: Minor


The symbol of BitwiseOr is defined as '&' but I think it's wrong. It should be 
'|'.
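
For illustration, a hedged sketch of what such a Catalyst expression definition looks like (class and method names are approximate, not quoted from the Spark source):

{code}
// Approximate shape inside Catalyst's expressions package (assumed names);
// the fix changes the symbol from "&" (the BitwiseAnd symbol) to "|".
case class BitwiseOr(left: Expression, right: Expression) extends BinaryArithmetic {
  def symbol = "|"
}
{code}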






[jira] [Commented] (SPARK-4426) The symbol of BitwiseOr is wrong, should not be '&'

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213472#comment-14213472
 ] 

Apache Spark commented on SPARK-4426:
-

User 'sarutak' has created a pull request for this issue:
https://github.com/apache/spark/pull/3284

 The symbol of BitwiseOr is wrong, should not be '&'
 ---

 Key: SPARK-4426
 URL: https://issues.apache.org/jira/browse/SPARK-4426
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Kousuke Saruta
Priority: Minor

 The symbol of BitwiseOr is defined as '&' but I think it's wrong. It should 
 be '|'.






[jira] [Created] (SPARK-4427) In Spark Streaming when we perform window operations, it will calculate based on system time; I need to override it, meaning instead of getting the current time the app needs to get it from my text file

2014-11-15 Thread ch.prasad (JIRA)
ch.prasad created SPARK-4427:


 Summary: In Spark Streaming when we perform window operations, it will 
calculate based on system time; I need to override it, meaning instead of 
getting the current time the app needs to get it from my text file.
 Key: SPARK-4427
 URL: https://issues.apache.org/jira/browse/SPARK-4427
 Project: Spark
  Issue Type: Bug
Reporter: ch.prasad


Please provide a solution ASAP.
In a window operation, when we give the window size, it gets data from RDDs by 
calculating the window against the current time. I need the current time to be 
read from my file instead.






[jira] [Commented] (SPARK-4426) The symbol of BitwiseOr is wrong, should not be '&'

2014-11-15 Thread ch.prasad (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213567#comment-14213567
 ] 

ch.prasad commented on SPARK-4426:
--

Please see my issue SPARK-4427; I hope all of you can take a look.
Thanks!

 The symbol of BitwiseOr is wrong, should not be '&'
 ---

 Key: SPARK-4426
 URL: https://issues.apache.org/jira/browse/SPARK-4426
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Kousuke Saruta
Priority: Minor

 The symbol of BitwiseOr is defined as '&' but I think it's wrong. It should 
 be '|'.






[jira] [Created] (SPARK-4428) Use ${scala.binary.version} property for artifactId.

2014-11-15 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-4428:


 Summary: Use ${scala.binary.version} property for artifactId.
 Key: SPARK-4428
 URL: https://issues.apache.org/jira/browse/SPARK-4428
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Takuya Ueshin









[jira] [Commented] (SPARK-4428) Use ${scala.binary.version} property for artifactId.

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213593#comment-14213593
 ] 

Apache Spark commented on SPARK-4428:
-

User 'ueshin' has created a pull request for this issue:
https://github.com/apache/spark/pull/3285

 Use ${scala.binary.version} property for artifactId.
 

 Key: SPARK-4428
 URL: https://issues.apache.org/jira/browse/SPARK-4428
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Takuya Ueshin








[jira] [Closed] (SPARK-4403) Elastic allocation(spark.dynamicAllocation.enabled) results in task never being execued.

2014-11-15 Thread Egor Pahomov (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egor Pahomov closed SPARK-4403.
---
Resolution: Invalid

 Elastic allocation(spark.dynamicAllocation.enabled) results in task never 
 being execued.
 

 Key: SPARK-4403
 URL: https://issues.apache.org/jira/browse/SPARK-4403
 Project: Spark
  Issue Type: Bug
  Components: Spark Core, YARN
Affects Versions: 1.1.1
Reporter: Egor Pahomov
 Attachments: ipython_out


 I execute ipython notebook + pyspark with spark.dynamicAllocation.enabled = 
 true. Task never ends.
 Code:
 {code}
 import sys
 from random import random
 from operator import add
 partitions = 10
 n = 10 * partitions
 def f(_):
     x = random() * 2 - 1
     y = random() * 2 - 1
     return 1 if x ** 2 + y ** 2 < 1 else 0
 count = sc.parallelize(xrange(1, n + 1), partitions).map(f).reduce(add)
 print "Pi is roughly %f" % (4.0 * count / n)
 {code}
 {code}
 IPYTHON_ARGS="notebook --profile=ydf --port $IPYTHON_PORT --port-retries=0 --ip='*' --no-browser"
 pyspark \
 --verbose \
 --master yarn-client \
 --conf spark.driver.port=$((RANDOM_PORT + 2)) \
 --conf spark.broadcast.port=$((RANDOM_PORT + 3)) \
 --conf spark.replClassServer.port=$((RANDOM_PORT + 4)) \
 --conf spark.blockManager.port=$((RANDOM_PORT + 5)) \
 --conf spark.executor.port=$((RANDOM_PORT + 6)) \
 --conf spark.fileserver.port=$((RANDOM_PORT + 7)) \
 --conf spark.shuffle.service.enabled=true \
 --conf spark.dynamicAllocation.enabled=true \
 --conf spark.dynamicAllocation.minExecutors=1 \
 --conf spark.dynamicAllocation.maxExecutors=10 \
 --conf spark.ui.port=$SPARK_UI_PORT
 {code}






[jira] [Created] (SPARK-4429) Build for Scala 2.11 using sbt fails.

2014-11-15 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-4429:


 Summary: Build for Scala 2.11 using sbt fails.
 Key: SPARK-4429
 URL: https://issues.apache.org/jira/browse/SPARK-4429
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Takuya Ueshin


I tried to build for Scala 2.11 using sbt with the following command:

{quote}
$ sbt/sbt -Dscala-2.11 assembly
{quote}

but it ends with the following error messages:

{quote}
\[error\] (streaming-kafka/*:update) sbt.ResolveException: unresolved 
dependency: org.apache.kafka#kafka_2.11;0.8.0: not found
\[error\] (catalyst/*:update) sbt.ResolveException: unresolved dependency: 
org.scalamacros#quasiquotes_2.11;2.0.1: not found
{quote}







[jira] [Resolved] (SPARK-4427) In Spark Streaming when we perform window operations, it will calculate based on system time; I need to override it, meaning instead of getting the current time the app needs to get it from my text file

2014-11-15 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-4427.
--
Resolution: Invalid

JIRAs are for reporting specific issues and proposing changes. This is a 
question, so it would be better asked on the mailing list. I suggest you reword 
the question, though, to be more complete and specific, as it's not quite clear 
what the issue is.

 In Spark Streaming when we perform window operations, it will calculate 
 based on system time; I need to override it, meaning instead of getting the 
 current time the app needs to get it from my text file.
 

 Key: SPARK-4427
 URL: https://issues.apache.org/jira/browse/SPARK-4427
 Project: Spark
  Issue Type: Bug
Reporter: ch.prasad
   Original Estimate: 24h
  Remaining Estimate: 24h

 Please provide a solution ASAP.
 In a window operation, when we give the window size, it gets data from RDDs 
 by calculating the window against the current time. I need the current time 
 to be read from my file instead.






[jira] [Commented] (SPARK-4428) Use ${scala.binary.version} property for artifactId.

2014-11-15 Thread Mark Hamstra (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213664#comment-14213664
 ] 

Mark Hamstra commented on SPARK-4428:
-

This is not a bug, nor is it a major issue, nor is parameterizing artifactIds 
in this way permissible. This is a Won't Fix.

 Use ${scala.binary.version} property for artifactId.
 

 Key: SPARK-4428
 URL: https://issues.apache.org/jira/browse/SPARK-4428
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Takuya Ueshin








[jira] [Resolved] (SPARK-4428) Use ${scala.binary.version} property for artifactId.

2014-11-15 Thread Mark Hamstra (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hamstra resolved SPARK-4428.
-
Resolution: Won't Fix

 Use ${scala.binary.version} property for artifactId.
 

 Key: SPARK-4428
 URL: https://issues.apache.org/jira/browse/SPARK-4428
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Takuya Ueshin








[jira] [Commented] (SPARK-4402) Output path validation of an action statement resulting in runtime exception

2014-11-15 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213729#comment-14213729
 ] 

Vijay commented on SPARK-4402:
--

Thanks for the reply, [~srowen].

This is a different scenario from SPARK-1100.

SPARK-1100 is about the output directory being overwritten if it exists; I 
think that fix works fine.

My concern is that Spark throws a runtime exception if the output directory 
exists. This happens after all the previous action statements have executed, 
resulting in abrupt termination of the program, and the results of those 
previous action statements are lost.

Please confirm whether this abrupt program termination is expected.

 Output path validation of an action statement resulting in runtime exception
 

 Key: SPARK-4402
 URL: https://issues.apache.org/jira/browse/SPARK-4402
 Project: Spark
  Issue Type: Wish
Reporter: Vijay
Priority: Minor

 Output path validation happens at the time of statement execution, as part 
 of the lazy evaluation of the action statement. If the path already exists, 
 a runtime exception is thrown, so all the processing completed up to that 
 point is lost, which wastes resources (processing time and CPU usage).
 If this I/O-related validation were done before the RDD action operations, 
 the runtime exception could be avoided.
 I believe similar validation is implemented in Hadoop as well.
 Example:
 SchemaRDD.saveAsTextFile() evaluates the path at runtime.






[jira] [Commented] (SPARK-4402) Output path validation of an action statement resulting in runtime exception

2014-11-15 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213739#comment-14213739
 ] 

Sean Owen commented on SPARK-4402:
--

Look at the code in PairRDDFunctions.saveAsHadoopDataset, which is what 
ultimately gets called. You'll see it try to check the output configuration 
upfront:

{code}
if (self.conf.getBoolean("spark.hadoop.validateOutputSpecs", true)) {
  // FileOutputFormat ignores the filesystem parameter
  val ignoredFs = FileSystem.get(hadoopConf)
  hadoopConf.getOutputFormat.checkOutputSpecs(ignoredFs, hadoopConf)
}
{code}

It's enabled by default. I wonder if the code path is somehow using a 
nonstandard OutputFormat that doesn't check? But this should cause an 
exception before the job starts if the output path exists; the check was 
committed in SPARK-1100 for 1.0.
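
For context, a minimal sketch of the expected fail-fast behavior under the default setting (paths are illustrative):

{code}
// With spark.hadoop.validateOutputSpecs left at its default of true, the
// second save should fail up front with FileAlreadyExistsException,
// before any tasks run.
sc.parallelize(1 to 10).saveAsTextFile("/tmp/out")
sc.parallelize(1 to 10).saveAsTextFile("/tmp/out")  // rejected before the job starts
{code}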

 Output path validation of an action statement resulting in runtime exception
 

 Key: SPARK-4402
 URL: https://issues.apache.org/jira/browse/SPARK-4402
 Project: Spark
  Issue Type: Wish
Reporter: Vijay
Priority: Minor

 Output path validation happens at the time of statement execution, as part 
 of the lazy evaluation of the action statement. If the path already exists, 
 a runtime exception is thrown, so all the processing completed up to that 
 point is lost, which wastes resources (processing time and CPU usage).
 If this I/O-related validation were done before the RDD action operations, 
 the runtime exception could be avoided.
 I believe similar validation is implemented in Hadoop as well.
 Example:
 SchemaRDD.saveAsTextFile() evaluates the path at runtime.






[jira] [Commented] (SPARK-4419) Upgrade Snappy Java to 1.1.1.6

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213744#comment-14213744
 ] 

Apache Spark commented on SPARK-4419:
-

User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/3287

 Upgrade Snappy Java to 1.1.1.6
 --

 Key: SPARK-4419
 URL: https://issues.apache.org/jira/browse/SPARK-4419
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Josh Rosen
Priority: Minor

 We should upgrade the Snappy Java library to get better error reporting 
 improvements.  I had tried this previously in SPARK-4056 but had to revert 
 that PR due to a memory leak / regression in Snappy Java.






[jira] [Created] (SPARK-4430) Apache RAT Checks fail spuriously on test files

2014-11-15 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-4430:


 Summary: Apache RAT Checks fail spuriously on test files
 Key: SPARK-4430
 URL: https://issues.apache.org/jira/browse/SPARK-4430
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Ryan Williams


Several of my recent runs of {{./dev/run-tests}} have failed quickly due to 
Apache RAT checks, e.g.:

{code}
$ ./dev/run-tests

=
Running Apache RAT checks
=
Could not find Apache license headers in the following files:
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/28
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/29
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/30
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/10
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/11
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/12
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/13
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/14
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/15
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/16
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/17
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/18
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/19
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/20
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/21
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/22
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/23
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/24
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/25
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/26
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/27
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/28
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/29
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/30
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/7
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/8
 !? 
/Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/9
[error] Got a return code of 1 on line 114 of the run-tests script.
{code}

I think it's fair to say that these are not useful errors for {{run-tests}} to 
crash on. Ideally we could tell the linter which files we care about having it 
lint and which we don't.








[jira] [Commented] (SPARK-4430) Apache RAT Checks fail spuriously on test files

2014-11-15 Thread Ryan Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213811#comment-14213811
 ] 

Ryan Williams commented on SPARK-4430:
--

I did find [this RAT JIRA|https://issues.apache.org/jira/browse/RAT-161] that 
seems somewhat related, but if there's anything we could do to work around this 
in Spark in the shorter term that would be great too.

 Apache RAT Checks fail spuriously on test files
 ---

 Key: SPARK-4430
 URL: https://issues.apache.org/jira/browse/SPARK-4430
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Ryan Williams

 Several of my recent runs of {{./dev/run-tests}} have failed quickly due to 
 Apache RAT checks, e.g.:
 {code}
 $ ./dev/run-tests
 =
 Running Apache RAT checks
 =
 Could not find Apache license headers in the following files:
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/28
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/29
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/30
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/10
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/11
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/12
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/13
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/14
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/15
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/16
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/17
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/18
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/19
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/20
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/21
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/22
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/23
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/24
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/25
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/26
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/27
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/28
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/29
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/30
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/7
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/8
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/9
 [error] Got a return code of 1 on line 114 of the run-tests script.
 {code}
 I think it's fair to say that these are not useful errors for {{run-tests}} 
 to crash on. Ideally we could tell the linter which files we care about 
 having it lint and which we don't.






[jira] [Created] (SPARK-4431) Implement efficient activeIterator for dense and sparse vector

2014-11-15 Thread DB Tsai (JIRA)
DB Tsai created SPARK-4431:
--

 Summary: Implement efficient activeIterator for dense and sparse 
vector
 Key: SPARK-4431
 URL: https://issues.apache.org/jira/browse/SPARK-4431
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: DB Tsai


Previously, we were using Breeze's activeIterator to access the non-zero 
elements in a sparse vector, and explicitly skipping the zeros in dense/sparse 
vectors using pattern matching. Due to the overhead, we switched back to a 
native `while loop` in #SPARK-4129.

However, #SPARK-4129 requires de-referencing dv.values/sv.values on each 
access to a value, and any zeros in a dense or sparse vector are only skipped 
inside the add function call; the overall penalty is around 10% compared with 
de-referencing once outside the while block and checking for zero before 
calling the add function. The code is also branched for dense and sparse 
vectors, which is not easy to maintain in the long term.

Not only does this activeIterator implementation increase performance, but the 
abstraction over accessing the non-zero elements of different vector types 
also helps the maintainability of the codebase. In this PR, only 
MultivariateOnlineSummarizer uses the new API as an example; others can be 
migrated to activeIterator later.

Benchmarking with the mnist8m dataset on a single JVM, with the first 200 
samples loaded in memory, repeating 5000 times.

Before change: 
Sparse Vector - 30.02
Dense Vector - 38.27

After this optimization:
Sparse Vector - 27.54
Dense Vector - 35.13
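
For illustration, a hedged sketch of the abstraction (an illustrative signature, not necessarily the API merged in the PR):

{code}
import org.apache.spark.mllib.linalg.{DenseVector, SparseVector, Vector}

// De-reference the backing arrays once, then visit only the active entries
// with a plain while loop.
def foreachActive(v: Vector)(f: (Int, Double) => Unit): Unit = v match {
  case dv: DenseVector =>
    val values = dv.values
    var i = 0
    while (i < values.length) { f(i, values(i)); i += 1 }
  case sv: SparseVector =>
    val indices = sv.indices
    val values = sv.values
    var k = 0
    while (k < values.length) { f(indices(k), values(k)); k += 1 }
}
{code}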






[jira] [Commented] (SPARK-4431) Implement efficient activeIterator for dense and sparse vector

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213826#comment-14213826
 ] 

Apache Spark commented on SPARK-4431:
-

User 'dbtsai' has created a pull request for this issue:
https://github.com/apache/spark/pull/3288

 Implement efficient activeIterator for dense and sparse vector
 --

 Key: SPARK-4431
 URL: https://issues.apache.org/jira/browse/SPARK-4431
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: DB Tsai

 Previously, we were using Breeze's activeIterator to access the non-zero 
 elements in a sparse vector, and explicitly skipping the zeros in dense/sparse 
 vectors using pattern matching. Due to the overhead, we switched back to a 
 native `while loop` in #SPARK-4129.
 However, #SPARK-4129 requires de-referencing dv.values/sv.values on each 
 access to a value, and any zeros in a dense or sparse vector are only skipped 
 inside the add function call; the overall penalty is around 10% compared with 
 de-referencing once outside the while block and checking for zero before 
 calling the add function. The code is also branched for dense and sparse 
 vectors, which is not easy to maintain in the long term.
 Not only does this activeIterator implementation increase performance, but the 
 abstraction over accessing the non-zero elements of different vector types 
 also helps the maintainability of the codebase. In this PR, only 
 MultivariateOnlineSummarizer uses the new API as an example; others can be 
 migrated to activeIterator later.
 Benchmarking with the mnist8m dataset on a single JVM, with the first 200 
 samples loaded in memory, repeating 5000 times.
 Before change: 
 Sparse Vector - 30.02
 Dense Vector - 38.27
 After this optimization:
 Sparse Vector - 27.54
 Dense Vector - 35.13






[jira] [Reopened] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-15 Thread Davies Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davies Liu reopened SPARK-4404:
---

After this patch, the SparkSubmitDriverBootstrapper will not exit if 
SparkSubmit dies first.

 SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process 
 ends 
 -

 Key: SPARK-4404
 URL: https://issues.apache.org/jira/browse/SPARK-4404
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic
 Fix For: 1.2.0


 When we have spark.driver.extra* or spark.driver.memory in 
 SPARK_SUBMIT_PROPERTIES_FILE, spark-class will use 
 SparkSubmitDriverBootstrapper to launch the driver.
 If we get the process id of SparkSubmitDriverBootstrapper and kill it while 
 it is running, we expect its SparkSubmit sub-process to stop as well.
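
For illustration, a hedged sketch of the general technique (not the actual Spark code; the SparkSubmit command line is a placeholder):

{code}
// Destroy the child when this JVM is asked to stop, so the sub-process does
// not outlive its bootstrapper. The command here is illustrative only.
val process = new ProcessBuilder("java", "org.apache.spark.deploy.SparkSubmit").start()
Runtime.getRuntime.addShutdownHook(new Thread {
  override def run(): Unit = process.destroy()
})
process.waitFor()
{code}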






[jira] [Commented] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213832#comment-14213832
 ] 

Apache Spark commented on SPARK-4404:
-

User 'davies' has created a pull request for this issue:
https://github.com/apache/spark/pull/3289

 SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process 
 ends 
 -

 Key: SPARK-4404
 URL: https://issues.apache.org/jira/browse/SPARK-4404
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic
 Fix For: 1.2.0


 When we have spark.driver.extra* or spark.driver.memory in 
 SPARK_SUBMIT_PROPERTIES_FILE, spark-class will use 
 SparkSubmitDriverBootstrapper to launch the driver.
 If we get the process id of SparkSubmitDriverBootstrapper and kill it while 
 it is running, we expect its SparkSubmit sub-process to stop as well.






[jira] [Updated] (SPARK-2335) k-Nearest Neighbor classification and regression for MLLib

2014-11-15 Thread Brian Gawalt (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Gawalt updated SPARK-2335:

Shepherd: Ashutosh Trivedi

 k-Nearest Neighbor classification and regression for MLLib
 --

 Key: SPARK-2335
 URL: https://issues.apache.org/jira/browse/SPARK-2335
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Brian Gawalt
Priority: Minor
  Labels: features

 The k-Nearest Neighbor model for classification and regression problems is a 
 simple and intuitive approach, offering a straightforward path to creating 
 non-linear decision/estimation contours. Its downsides -- high variance 
 (sensitivity to the known training data set) and computational intensity for 
 estimating new point labels -- both play to Spark's big data strengths: lots 
 of data mitigates data concerns; lots of workers mitigate computational 
 latency. 
 We should include kNN models as options in MLLib.
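
As a sketch of the idea, a hedged brute-force kNN regression query over an RDD (illustrative only; knnPredict is not a proposed MLlib API):

{code}
import org.apache.spark.rdd.RDD

// Average the labels of the k nearest (features, label) pairs to the query.
def knnPredict(data: RDD[(Array[Double], Double)], query: Array[Double], k: Int): Double = {
  def dist2(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  val nearest = data
    .map { case (features, label) => (dist2(features, query), label) }
    .takeOrdered(k)(Ordering.by[(Double, Double), Double](_._1))
  nearest.map(_._2).sum / k
}
{code}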






[jira] [Commented] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-15 Thread Davies Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213838#comment-14213838
 ] 

Davies Liu commented on SPARK-4404:
---

Also, pyspark fails to start if spark.driver.memory is set; I have not 
investigated the details yet.

 SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process 
 ends 
 -

 Key: SPARK-4404
 URL: https://issues.apache.org/jira/browse/SPARK-4404
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic
 Fix For: 1.2.0


 When we have spark.driver.extra* or spark.driver.memory in 
 SPARK_SUBMIT_PROPERTIES_FILE, spark-class will use 
 SparkSubmitDriverBootstrapper to launch the driver.
 If we get the process id of SparkSubmitDriverBootstrapper and kill it while 
 it is running, we expect its SparkSubmit sub-process to stop as well.






[jira] [Commented] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-15 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213840#comment-14213840
 ] 

Marcelo Vanzin commented on SPARK-4404:
---

Hmmm, my reading is that the bug title and the bug description don't match.

Title: SparkSubmitDriverBootstrapper should stop after SparkSubmit ends
Description: killing SparkSubmitDriverBootstrapper should also kill 
SparkSubmit

Pardon if I misunderstood something, but could you clarify what's not working 
as expected?

 SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process 
 ends 
 -

 Key: SPARK-4404
 URL: https://issues.apache.org/jira/browse/SPARK-4404
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic
 Fix For: 1.2.0


 When we have spark.driver.extra* or spark.driver.memory in 
 SPARK_SUBMIT_PROPERTIES_FILE, spark-class will use 
 SparkSubmitDriverBootstrapper to launch the driver.
 If we get the process id of SparkSubmitDriverBootstrapper and kill it while 
 it is running, we expect its SparkSubmit sub-process to stop as well.






[jira] [Created] (SPARK-4432) Resource(InStream) is not closed in TachyonStore

2014-11-15 Thread shimingfei (JIRA)
shimingfei created SPARK-4432:
-

 Summary: Resource(InStream) is not closed in TachyonStore
 Key: SPARK-4432
 URL: https://issues.apache.org/jira/browse/SPARK-4432
 Project: Spark
  Issue Type: Bug
  Components: Block Manager
Affects Versions: 1.1.0
Reporter: shimingfei


In TachyonStore, the InStream is not closed after data is read from Tachyon, 
which leaves the blocks in Tachyon locked after they are accessed.
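
A minimal sketch of the usual fix pattern (illustrative; not the exact patch in the pull request):

{code}
// Always close the stream in a finally block so Tachyon can release the
// block lock even if the read fails part-way through.
def readFully(is: java.io.InputStream): Array[Byte] =
  try {
    val out = new java.io.ByteArrayOutputStream()
    val buf = new Array[Byte](8192)
    var n = is.read(buf)
    while (n != -1) { out.write(buf, 0, n); n = is.read(buf) }
    out.toByteArray
  } finally {
    is.close()
  }
{code}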






[jira] [Commented] (SPARK-4432) Resource(InStream) is not closed in TachyonStore

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213845#comment-14213845
 ] 

Apache Spark commented on SPARK-4432:
-

User 'shimingfei' has created a pull request for this issue:
https://github.com/apache/spark/pull/3290

 Resource(InStream) is not closed in TachyonStore
 

 Key: SPARK-4432
 URL: https://issues.apache.org/jira/browse/SPARK-4432
 Project: Spark
  Issue Type: Bug
  Components: Block Manager
Affects Versions: 1.1.0
Reporter: shimingfei

 In TachyonStore, the InStream is not closed after data is read from Tachyon, 
 which leaves the blocks in Tachyon locked after they are accessed.






[jira] [Created] (SPARK-4433) Racing condition in zipWithIndex

2014-11-15 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-4433:


 Summary: Racing condition in zipWithIndex
 Key: SPARK-4433
 URL: https://issues.apache.org/jira/browse/SPARK-4433
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.0.2, 1.1.1, 1.2.0
Reporter: Xiangrui Meng
Assignee: Xiangrui Meng


Spark hangs with the following code:

{code}
sc.parallelize(1 to 10).zipWithIndex.repartition(10).count()
{code}

This is because ZippedWithIndexRDD triggers a job in getPartitions, and it 
causes a deadlock in DAGScheduler.getPreferredLocs.
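
For context, a hedged sketch of why getPartitions runs a job (approximating what ZippedWithIndexRDD does; rdd stands for the parent RDD):

{code}
// Each partition's starting index needs the element counts of all earlier
// partitions, and computing those counts is itself a Spark job, here
// triggered from inside getPartitions.
val counts = rdd.mapPartitions(it => Iterator(it.size.toLong)).collect()
val startIndices = counts.scanLeft(0L)(_ + _)
{code}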






[jira] [Commented] (SPARK-2335) k-Nearest Neighbor classification and regression for MLLib

2014-11-15 Thread Kaushik Ranjan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213849#comment-14213849
 ] 

Kaushik Ranjan commented on SPARK-2335:
---

Ha ha.
Shepherd and I were working on this together.

[~bgawalt] - if you could review the code and suggest changes (if any), I can 
take it forward.

 k-Nearest Neighbor classification and regression for MLLib
 --

 Key: SPARK-2335
 URL: https://issues.apache.org/jira/browse/SPARK-2335
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Brian Gawalt
Priority: Minor
  Labels: features

 The k-Nearest Neighbor model for classification and regression problems is a 
 simple and intuitive approach, offering a straightforward path to creating 
 non-linear decision/estimation contours. Its downsides -- high variance 
 (sensitivity to the known training data set) and computational intensity for 
 estimating new point labels -- both play to Spark's big data strengths: lots 
 of data mitigates data concerns; lots of workers mitigate computational 
 latency. 
 We should include kNN models as options in MLLib.






[jira] [Commented] (SPARK-4433) Racing condition in zipWithIndex

2014-11-15 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213850#comment-14213850
 ] 

Apache Spark commented on SPARK-4433:
-

User 'mengxr' has created a pull request for this issue:
https://github.com/apache/spark/pull/3291

 Racing condition in zipWithIndex
 

 Key: SPARK-4433
 URL: https://issues.apache.org/jira/browse/SPARK-4433
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.0.2, 1.1.1, 1.2.0
Reporter: Xiangrui Meng
Assignee: Xiangrui Meng

 Spark hangs with the following code:
 {code}
 sc.parallelize(1 to 10).zipWithIndex.repartition(10).count()
 {code}
 This is because ZippedWithIndexRDD triggers a job in getPartitions, and it 
 causes a deadlock in DAGScheduler.getPreferredLocs.






[jira] [Commented] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-15 Thread Davies Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213852#comment-14213852
 ] 

Davies Liu commented on SPARK-4404:
---

This JIRA was re-opened because the new bug was introduced by this JIRA's patch.

Should I create a new JIRA for it?

 SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process 
 ends 
 -

 Key: SPARK-4404
 URL: https://issues.apache.org/jira/browse/SPARK-4404
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.1.0
Reporter: WangTaoTheTonic
Assignee: WangTaoTheTonic
 Fix For: 1.2.0


 When we have spark.driver.extra* or spark.driver.memory in 
 SPARK_SUBMIT_PROPERTIES_FILE, spark-class will use 
 SparkSubmitDriverBootstrapper to launch the driver.
 If we get the process id of SparkSubmitDriverBootstrapper and kill it while 
 it is running, we expect its SparkSubmit sub-process to stop as well.






[jira] [Resolved] (SPARK-4419) Upgrade Snappy Java to 1.1.1.6

2014-11-15 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin resolved SPARK-4419.

   Resolution: Fixed
Fix Version/s: 1.2.0
 Assignee: Josh Rosen

 Upgrade Snappy Java to 1.1.1.6
 --

 Key: SPARK-4419
 URL: https://issues.apache.org/jira/browse/SPARK-4419
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Josh Rosen
Assignee: Josh Rosen
Priority: Minor
 Fix For: 1.2.0


 We should upgrade the Snappy Java library to get better error reporting 
 improvements.  I had tried this previously in SPARK-4056 but had to revert 
 that PR due to a memory leak / regression in Snappy Java.






[jira] [Closed] (SPARK-4427) In Spark Streaming when we perform window operations, it will calculate based on system time; I need to override it, meaning instead of getting the current time the app needs to get it from my text file

2014-11-15 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin closed SPARK-4427.
--

 In Spark Streaming when we perform window operations, it will calculate 
 based on system time; I need to override it, meaning instead of getting the 
 current time the app needs to get it from my text file.
 

 Key: SPARK-4427
 URL: https://issues.apache.org/jira/browse/SPARK-4427
 Project: Spark
  Issue Type: Bug
Reporter: ch.prasad
   Original Estimate: 24h
  Remaining Estimate: 24h

 Please provide a solution ASAP.
 In a window operation, when we give the window size, it gets data from RDDs 
 by calculating the window against the current time. I need the current time 
 to be read from my file instead.






[jira] [Updated] (SPARK-4426) The symbol of BitwiseOr is wrong, should not be '&'

2014-11-15 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-4426:
---
Target Version/s: 1.2.0  (was: 1.2.0, 1.3.0)

 The symbol of BitwiseOr is wrong, should not be '&'
 ---

 Key: SPARK-4426
 URL: https://issues.apache.org/jira/browse/SPARK-4426
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
Priority: Minor
 Fix For: 1.2.0


 The symbol of BitwiseOr is defined as '&' but I think it's wrong. It should 
 be '|'.






[jira] [Resolved] (SPARK-4426) The symbol of BitwiseOr is wrong, should not be '&'

2014-11-15 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin resolved SPARK-4426.

   Resolution: Fixed
Fix Version/s: 1.2.0
 Assignee: Kousuke Saruta

 The symbol of BitwiseOr is wrong, should not be '&'
 ---

 Key: SPARK-4426
 URL: https://issues.apache.org/jira/browse/SPARK-4426
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
Priority: Minor
 Fix For: 1.2.0


 The symbol of BitwiseOr is defined as '&' but I think it's wrong. It should 
 be '|'.






[jira] [Commented] (SPARK-4430) Apache RAT Checks fail spuriously on test files

2014-11-15 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213873#comment-14213873
 ] 

Sean Owen commented on SPARK-4430:
--

I imagine the real issue is that the test should clean up these files if it 
does not already do so. Those aren't in the source tree, so it is not really a 
RAT config issue. If the files were left behind because tests crashed or were 
killed, then just delete them.

 Apache RAT Checks fail spuriously on test files
 ---

 Key: SPARK-4430
 URL: https://issues.apache.org/jira/browse/SPARK-4430
 Project: Spark
  Issue Type: Bug
  Components: Build
Reporter: Ryan Williams

 Several of my recent runs of {{./dev/run-tests}} have failed quickly due to 
 Apache RAT checks, e.g.:
 {code}
 $ ./dev/run-tests
 =
 Running Apache RAT checks
 =
 Could not find Apache license headers in the following files:
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/28
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/29
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b732c105-4fd3-4330-ba6d-a366b340c303/test/30
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/10
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/11
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/12
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/13
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/14
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/15
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/16
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/17
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/18
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/19
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/20
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/21
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/22
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/23
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/24
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/25
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/26
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/27
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/28
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/29
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/30
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/7
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/8
  !? 
 /Users/ryan/c/spark/streaming/FailureSuite/b98beebe-98b0-472a-b4a5-060bcd91e401/test/9
 [error] Got a return code of 1 on line 114 of the run-tests script.
 {code}
 I think it's fair to say that these are not useful errors for {{run-tests}} 
 to crash on. Ideally we could tell the linter which files we care about 
 having it lint and which we don't.


