Re: [VOTE] Designating maintainers for some Spark components

2014-11-11 Thread Yu Ishikawa
: Matei, Patrick, Reynold > > - Job scheduler: Matei, Kay, Patrick > > - Shuffle and network: Reynold, Aaron, Matei > > - Block manager: Reynold, Aaron > > - YARN: Tom, Andrew Or > > - Python: Josh, Matei > > - MLlib: Xiangrui, Matei > > - SQL: M

Re: JIRA + PR backlog

2014-11-11 Thread Yu Ishikawa
Great job! I didn't know about the "Spark PR Dashboard." Thanks, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/JIRA-PR-backlog-tp9157p9282.html Sent from the Apache Spark Developers List mailing list archiv

[mllib] Which is the correct package to add a new algorithm?

2014-11-27 Thread Yu Ishikawa
Hi all, The Spark ML alpha version exists in the current master branch on GitHub. If we want to add new machine learning algorithms or modify algorithms that already exist, in which package should we implement them: org.apache.spark.mllib or org.apache.spark.ml? thanks, Yu - -- Yu

Re: [mllib] Which is the correct package to add a new algorithm?

2014-11-30 Thread Yu Ishikawa
ibute new algorithms to spark.mllib. thanks, Yu ----- -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Which-is-the-correct-package-to-add-a-new-algorithm-tp9540p9575.html Sent from the Apache Spark Developers List mailing list a

[mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-15 Thread Yu Ishikawa
, 0.0, 0.0, 0.1, 0.0) org.scalatest.exceptions.TestFailedException: DenseVector(0.0, 0.0, 0.0, 0.0, 0.0, 0.0) did not equal DenseVector(0.0, 0.0, 0.0, 0.0, 0.1, 0.0) ``` Thanks, Yu Ishikawa ----- -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.

Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-15 Thread Yu Ishikawa
later - ASF JIRA https://issues.apache.org/jira/browse/SPARK-6341 Thanks, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Is-there-any-bugs-to-divide-a-Breeze-sparse-vectors-at-Spark-v1-3-0-rc3-tp11056p11058.html Sent

Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-18 Thread Yu Ishikawa
Sorry for the delay in replying. I moved from Tokyo to New York in order to attend Spark Summit East. I verified the snapshot and the difference. https://github.com/scalanlp/breeze/commit/f61d2f61137807651fc860404a244640e213f6d3 Thank you for your great work! Yu Ishikawa - -- Yu Ishikawa

[mllib] Deprecate static train and use builder instead for Scala/Java

2015-04-06 Thread Yu Ishikawa
use builder instead for Scala/Java https://issues.apache.org/jira/browse/SPARK-6682 Thanks Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Deprecate-static-train-and-use-builder-instead-for-Scala-Java-tp11438

[SparkR] Have we already had any lint for SparkR?

2015-06-17 Thread Yu Ishikawa
am to lint the R code to follow the Google Style Guide - Google Project Hosting https://code.google.com/p/google-rlint/ I tried to find an issue like that. I couldn't find one. I'm afraid we have already had a mechanism to check R codes. Thanks, Yu - -- Yu Ishikawa -- View thi

Re: [SparkR] Have we already had any lint for SparkR?

2015-06-17 Thread Yu Ishikawa
Hi Shivaram, Thank you for your reply and letting me know that. I will join the discussion on JIRA later. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SparkR-Have-we-already-had-any-lint-for-SparkR-tp12773p12775

[mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

2015-06-17 Thread Yu Ishikawa
- -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python-not-inheriting-JavaModelWrapper-tp12781.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com

Re: [mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

2015-06-19 Thread Yu Ishikawa
Hi Xiangrui I got it. I will try to refactor any model class not inheriting JavaModelWrapper and show you it. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python

Re: Workaround for problems with OS X + JIRA Client

2015-06-19 Thread Yu Ishikawa
Hi Sean, That sounds interesting. I didn't know the client. I will try it later. Thank you for sharing the information. Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Workaround-for-problems-with-OS-X-JIRA-C

[pyspark][mllib] What is the best way to treat int and long int between python2.6/python3.4 and Java?

2015-06-20 Thread Yu Ishikawa
tackling "[SPARK-6259] Python API for LDA". We wonder if we should create a wrapper class for the document of LDA or not. Do you have any idea to implement it? https://issues.apache.org/jira/browse/SPARK-6259 Thanks, Yu ----- -- Yu Ishikawa -- View this message in context: http://ap

[jenkins] ERROR: Publisher 'Publish JUnit test result report' failed: No test report files were found. Configuration error?

2015-06-21 Thread Yu Ishikawa
n error? Finished: FAILURE ``` It seems that the unit tests related to the PR passed. However, Jenkins posted "Merged build finished. Test FAILed." to GitHub. https://github.com/apache/spark/pull/6926 Thanks Yu - -- Yu Ishikawa -- View this message in context: http://apa

Re: [jenkins] ERROR: Publisher 'Publish JUnit test result report' failed: No test report files were found. Configuration error?

2015-06-21 Thread Yu Ishikawa
Hi Josh, Thank you for your continuous support. I'm looking forward to the fix. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/jenkins-ERROR-Publisher-Publish-JUnit-test-result-report-failed-No-test-report-

[pyspark] What is the best way to run a minimum unit testing related to our developing module?

2015-07-01 Thread Yu Ishikawa
s the best way to run a minimum unit testing related to our developing modules under the current version? Of course, I think it would be nice to be able to identify testing targets with the script like scala's sbt. Thanks, Yu ----- -- Yu Ishikawa -- View this message in context: http://ap

Re: [pyspark] What is the best way to run a minimum unit testing related to our developing module?

2015-07-01 Thread Yu Ishikawa
Thanks! --Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/pyspark-What-is-the-best-way-to-run-a-minimum-unit-testing-related-to-our-developing-module-tp12987p12989.html Sent from the Apache Spark Developers List mailing list

Re: [pyspark] What is the best way to run a minimum unit testing related to our developing module?

2015-07-01 Thread Yu ISHIKAWA
Thanks! --Yu 2015-07-02 13:13 GMT+09:00 Reynold Xin : > Run > > ./python/run-tests --help > > and you will see. :) > > On Wed, Jul 1, 2015 at 9:10 PM, Yu Ishikawa > wrote: > >> Hi all, >> >> When I develop pyspark modules, such as adding a sp

What is the difference between SlowSparkPullRequestBuilder and SparkPullRequestBuilder?

2015-07-21 Thread Yu Ishikawa
Hi all, When we send a PR, it seems that two requests to run tests are thrown to the Jenkins sometimes. What is the difference between SparkPullRequestBuilder and SlowSparkPullRequestBuilder? Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers

Re: What is the difference between SlowSparkPullRequestBuilder and SparkPullRequestBuilder?

2015-07-22 Thread Yu Ishikawa
Hi Andrew, I understand that there is no difference currently. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/What-is-the-difference-between-SlowSparkPullRequestBuilder-and-SparkPullRequestBuilder-tp13377p13380.html

Is `dev/lint-python` broken?

2015-07-27 Thread Yu Ishikawa
on: line 64: `easy_install -d "$PYLINT_HOME" pylint==1.4.4 &>> "$PYLINT_INSTALL_INFO"' ``` If the redirect is a syntax error, I'll send a PR to fix. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001

Re: Is `dev/lint-python` broken?

2015-07-27 Thread Yu Ishikawa
Hi Sean, Thank you for answering my question. It seems that I used an old version of bash, which is the default bash on Mac. ``` $> bash --version GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin14) Copyright (C) 2007 Free Software Foundation, Inc. share_history ``` Thanks, Yu - --

Re: Is `dev/lint-python` broken?

2015-07-27 Thread Yu Ishikawa
I'm using 10.10.4. And Xcode is version 6.4. Maybe, it isn't old. I guess the old bash version causes the problem. I'll try to install another bash with brew. - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Is-
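
A note on the redirect discussed in this thread: the `&>>` operator (append both stdout and stderr) was added in bash 4.0, so bash 3.2 reports it as a syntax error. The following is a minimal sketch of a form that also parses under older bash. It is not the actual fix applied to `dev/lint-python`; `LOG_FILE` and `log_both` are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Portable alternative to `cmd &>> file`, which needs bash 4.0 or newer.
# LOG_FILE and log_both are hypothetical names, not taken from dev/lint-python.
LOG_FILE="$(mktemp)"

log_both() {
  # Writes one line to stdout and one line to stderr.
  echo "to stdout"
  echo "to stderr" >&2
}

# Append stdout to the file, then point stderr at the same destination.
# This spelling also parses under bash 3.2.
log_both >> "$LOG_FILE" 2>&1
log_both >> "$LOG_FILE" 2>&1

# Show what was captured: both streams from both calls.
cat "$LOG_FILE"
```

The order matters: `>> file 2>&1` sends both streams to the file, whereas `2>&1 >> file` would leave stderr on the terminal.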

[SparkR] lint script for SparkR

2015-09-01 Thread Yu Ishikawa
13 Thanks Shivaram and Josh, I couldn't have done it without you. Thanks Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SparkR-lint-script-for-SpakrR-tp13923.html Sent from the Apache Spark Developers List mailing list archiv

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Yu Ishikawa
Great work, everyone! - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/ANNOUNCE-Announcing-Spark-1-5-0-tp14013p14015.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com

How do we convert a Dataset includes timestamp columns to RDD?

2015-12-16 Thread Yu Ishikawa
aSerializer.scala:100) at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301) ... 68 more ``` Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-do-we-convert-a-Dataset-includes

Re: How do we convert a Dataset includes timestamp columns to RDD?

2015-12-17 Thread Yu Ishikawa
Hi Kosuke, Thank you for the PR. I think we should fix this bug before releasing Spark 1.6 ASAP. I'm looking forward to merging it. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-do-we-convert-a-Dataset-inc

Is there any way to select columns of Dataset in addition to the combination of `expr` and `as`?

2015-12-18 Thread Yu Ishikawa
ds.select("id").show :34: error: type mismatch; found : String("id") required: org.apache.spark.sql.TypedColumn[Person,?] ds.select("id").show ``` Best, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Is-th

Re: RDD[Vector] Immutability issue

2015-12-28 Thread Yu Ishikawa
Hi salexln, Did you reproduce the same issue under any different conditions? I can't reproduce this issue, since I don't know the details of the algorithm. Could you let me know the detailed conditions or the repository? Thanks, Yu ----- -- Yu Ishikawa -- View this message in cont

Can I translate the documentations of Spark in Japanese?

2014-07-27 Thread Yu Ishikawa
Hi all, I'm Yu Ishikawa, and I am Japanese. I would like to officially translate the Spark 1.0.x documentation. If I translate it and send a pull request, can you merge it? And which directory is the best place for the Japanese documentation? Best, Yu -- View this messa

Re: Can I translate the documentations of Spark in Japanese?

2014-07-31 Thread Yu Ishikawa
Hi Kenichi Takagiwa, Thank you for commenting. I am going to proceed with the translation. Would you please help me? Further details will be sent later. Best, Yu -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Can-I-translate-the-documentations-of-Sp

Re: Can I translate the documentations of Spark in Japanese?

2014-07-31 Thread Yu Ishikawa
Hi Nick, > I know some projects get translations crowdsourced via one website or > other. Thank you for your comments. I think crowdsourced translation is a good fit for the translation project on GitHub. Best, Yu -- View this message in context: http://apache-spark-developers-list.1001551.n3.nab

Re: Contributing to MLlib: Proposal for Clustering Algorithms

2014-08-13 Thread Yu Ishikawa
, please let me know. best, Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contributing-to-MLlib-Proposal-for-Clustering-Algorithms-tp7212p7822.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com

[mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi all, It seems that there is a method to multiply a RowMatrix and a (local) Matrix. However, there is not a method to multiply a large scale matrix and another one in Spark. It would be helpful. Does anyone have a plan to add multiplying large scale matrices? Or shouldn't we support it in Sp

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi RJ, Thank you for your comment. I am interested in having other matrix operations too. I will create a JIRA issue first. thanks, -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Add-multiplying-large-scale-matrices-tp8291p8293.h

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi Evan, That sounds interesting. Here is the ticket which I created. https://issues.apache.org/jira/browse/SPARK-3416 thanks, -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Add-multiplying-large-scale-matrices-tp8291p8296.html Sent from

Re: [mllib] Add multiplying large scale matrices

2014-09-06 Thread Yu Ishikawa
Hi Jeremy, Great work! I'm interested in your work. If your code is on GitHub, could you let me know? -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Add-multiplying-large-scale-matrices-tp8291p8309.html Sent fro

Re: [mllib] Add multiplying large scale matrices

2014-09-06 Thread Yu Ishikawa
Hi Rong, Great job! Thank you for letting me know about your work. I will read the source code of saury later. Although AMPLab is working to implement them, would you like to merge it into Spark? Best, -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3

Re: [mllib] Add multiplying large scale matrices

2014-09-08 Thread Yu Ishikawa
Hi Xiangrui Meng, Thank you for your comment and creating tickets. The ticket which I created would be moved to your tickets. I will close my ticket, and then will link it to yours later. Best, Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3

Re: MLlib enable extension of the LabeledPoint class

2014-09-25 Thread Yu Ishikawa
ughts on it. For example, ``` abstract class LabeledPoint[T](label: T, features: Vector) ``` thanks ----- -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-enable-extension-of-the-LabeledPoint-class-tp8546p8549.html Sent from th

Re: MLlib enable extension of the LabeledPoint class

2014-09-25 Thread Yu Ishikawa
Hi Egor Pahomov, Thank you for your comment! - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-enable-extension-of-the-LabeledPoint-class-tp8546p8551.html Sent from the Apache Spark Developers List mailing list archive at

What is the best way to build my developing Spark for testing on EC2?

2014-10-02 Thread Yu Ishikawa
script for a developing version like spark-ec2 script? Or if you have any good idea to evaluate the performance of a developing MLlib algorithm on a spark cluster like EC2, could you tell me? Best, - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list

Re: What is the best way to build my developing Spark for testing on EC2?

2014-10-06 Thread Yu Ishikawa
crude like bash scripts running my program and > collecting output. It's just as you thought. I agree with you. > You could have a look at the spark-perf repo if you want something a > little better principled/automatic. I overlooked this. I will give it a try. best, - -- Yu Ish

Standardized Distance Functions in MLlib

2014-10-08 Thread Yu Ishikawa
community. https://github.com/apache/spark/pull/1964#issuecomment-54953348 Best, ----- -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Standardized-Distance-Functions-in-MLlib-tp8697.html Sent from the Apache Spark Developers List ma

Re: Standardized Distance Functions in MLlib

2014-10-08 Thread Yu Ishikawa
how to embed > distance measures there. All right. I will check it. thanks, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Standardized-Distance-Functions-in-MLlib-tp8697p8711.html Sent from the Apache Spark Dev

[mllib] Share the simple benchmark result about the cast cost from Spark vector to Breeze vector

2014-10-15 Thread Yu Ishikawa
had expected. For more information, please read the below report, if you are interested in it. https://github.com/yu-iskw/benchmark-breeze-on-spark/blob/master/doc%2Fbenchmark-result.md Best, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list