Peng Meng created SPARK-17017:
-
Summary: Add a chiSquare Selector based on False Positive Rate
(FPR) test
Key: SPARK-17017
URL: https://issues.apache.org/jira/browse/SPARK-17017
Project: Spark
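As a rough illustration of what a False Positive Rate selector does, the sketch below keeps every feature whose chi-square test p-value falls below a significance level. It assumes the per-feature p-values have already been computed; the function name `select_by_fpr`, the `alpha` parameter, and the sample p-values are hypothetical and are not taken from the ticket or from Spark's API.

```python
def select_by_fpr(p_values, alpha=0.05):
    """Return indices of features whose p-value is below alpha.

    p_values: one chi-square test p-value per feature (assumed precomputed).
    alpha: the highest p-value a feature may have and still be kept.
    """
    return [i for i, p in enumerate(p_values) if p < alpha]


# Made-up p-values for four features, purely for illustration.
example_p_values = [0.001, 0.20, 0.04, 0.75]
selected = select_by_fpr(example_p_values, alpha=0.05)
```

Unlike a top-k selector, the number of features kept here is not fixed in advance; it depends on how many p-values fall under `alpha`.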
[
https://issues.apache.org/jira/browse/SPARK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-17017:
--
Affects Version/s: (was: 2.0.0)
> Add a chiSquare Selector based on False Positive Rate (FPR) test
[
https://issues.apache.org/jira/browse/SPARK-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-17017:
--
Target Version/s: (was: 2.1.0)
> Add a chiSquare Selector based on False Positive Rate (FPR) test
>
Peng Meng created SPARK-16843:
-
Summary: Select features according to a percentile of the highest
scores of ChiSqSelector
Key: SPARK-16843
URL: https://issues.apache.org/jira/browse/SPARK-16843
Project:
[
https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-16843:
--
Affects Version/s: (was: 2.0.0)
2.1.0
> Select features according to a
[
https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-16843:
--
Target Version/s: (was: 2.0.1)
> Select features according to a percentile of the highest scores of
[
https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-16843:
--
Priority: Minor (was: Major)
> Select features according to a percentile of the highest scores of
>
[
https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-16843:
--
Fix Version/s: (was: 2.0.1)
2.1.0
> Select features according to a percentile
[
https://issues.apache.org/jira/browse/SPARK-17207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434527#comment-15434527
]
Peng Meng commented on SPARK-17207:
---
Thanks Owen, I am testing the code with array length check. will
[
https://issues.apache.org/jira/browse/SPARK-17207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436874#comment-15436874
]
Peng Meng commented on SPARK-17207:
---
Hi,
[
https://issues.apache.org/jira/browse/SPARK-17207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437013#comment-15437013
]
Peng Meng commented on SPARK-17207:
---
Ok, thanks. I will fix CountVectorizerSuite test error in this PR.
[
https://issues.apache.org/jira/browse/SPARK-17207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436986#comment-15436986
]
Peng Meng commented on SPARK-17207:
---
This is the bug information:
[
https://issues.apache.org/jira/browse/SPARK-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477019#comment-15477019
]
Peng Meng commented on SPARK-17462:
---
Hi [~josephkb], will you work on this? If not, I can work on it.
[
https://issues.apache.org/jira/browse/SPARK-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483336#comment-15483336
]
Peng Meng commented on SPARK-17462:
---
Hi [~josephkb], I am busy these days; I am glad VinceShieh can help
Peng Meng created SPARK-17505:
-
Summary: Add setBins for BinaryClassificationMetrics in
mllib/evaluation
Key: SPARK-17505
URL: https://issues.apache.org/jira/browse/SPARK-17505
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475769#comment-15475769
]
Peng Meng commented on SPARK-6160:
--
Hi [~josephkb], I have some discussion with [~srowen] about keeping
[
https://issues.apache.org/jira/browse/SPARK-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475709#comment-15475709
]
Peng Meng commented on SPARK-6160:
--
hi Joseph K. Bradley
> ChiSqSelector should keep test statistic info
[
https://issues.apache.org/jira/browse/SPARK-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-6160:
-
Comment: was deleted
(was: hi Joseph K. Bradley)
> ChiSqSelector should keep test statistic info
>
[
https://issues.apache.org/jira/browse/SPARK-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475860#comment-15475860
]
Peng Meng commented on SPARK-6160:
--
Hi [~GayathriMurali], are you still working on this? If not, I can
Peng Meng created SPARK-17645:
-
Summary: Add feature selector methods based on: False Discovery
Rate (FDR) and Family Wise Error rate (FWE)
Key: SPARK-17645
URL: https://issues.apache.org/jira/browse/SPARK-17645
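For readers unfamiliar with the two criteria named in this summary, here is a minimal sketch under stated assumptions. It uses the Benjamini-Hochberg procedure for FDR and a Bonferroni-style threshold for FWE, which are common choices but are only assumed here to be what the ticket has in mind. The function names and the sample p-values are hypothetical.

```python
def select_by_fdr(p_values, alpha=0.05):
    """Benjamini-Hochberg: keep the k smallest p-values, where k is the
    largest rank with p_(k) <= alpha * k / n."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / n:
            cutoff = rank
    return sorted(order[:cutoff])


def select_by_fwe(p_values, alpha=0.05):
    """Bonferroni-style: keep features with p-value below alpha / n."""
    n = len(p_values)
    return [i for i, p in enumerate(p_values) if p < alpha / n]


# Made-up p-values for four features, purely for illustration.
example_p_values = [0.001, 0.20, 0.02, 0.75]
fdr_selected = select_by_fdr(example_p_values)
fwe_selected = select_by_fwe(example_p_values)
```

With these sample values the FWE rule is the stricter of the two, since every feature is compared against `alpha / n` rather than a rank-dependent threshold.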
[
https://issues.apache.org/jira/browse/SPARK-17207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434241#comment-15434241
]
Peng Meng commented on SPARK-17207:
---
This is caused by a problem with zipping two Vectors:
def absTol(eps:
Peng Meng created SPARK-17207:
-
Summary: Comparing Vector in relative tolerance or absolute
tolerance in UnitTests error
Key: SPARK-17207
URL: https://issues.apache.org/jira/browse/SPARK-17207
Project:
[
https://issues.apache.org/jira/browse/SPARK-17207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434241#comment-15434241
]
Peng Meng edited comment on SPARK-17207 at 8/24/16 6:20 AM:
This is caused by
[
https://issues.apache.org/jira/browse/SPARK-18062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15600926#comment-15600926
]
Peng Meng commented on SPARK-18062:
---
This relates to how to interpret an all-0 rawPrediction, all classes
[
https://issues.apache.org/jira/browse/SPARK-18088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605211#comment-15605211
]
Peng Meng commented on SPARK-18088:
---
Hi [~josephkb], I do not quite understand "Testing against only
[
https://issues.apache.org/jira/browse/SPARK-18088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605231#comment-15605231
]
Peng Meng commented on SPARK-18088:
---
In the previous implementation, testing against only the
[
https://issues.apache.org/jira/browse/SPARK-18088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608322#comment-15608322
]
Peng Meng commented on SPARK-18088:
---
I am neutral on changing the selectorType "KBest" to
[
https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-17870:
--
Summary: ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is
wrong (was: ML/MLLIB:
[
https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565225#comment-15565225
]
Peng Meng commented on SPARK-17870:
---
Yes, SelectKBest and SelectPercentile in scikit-learn only use
[
https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565251#comment-15565251
]
Peng Meng commented on SPARK-17870:
---
The scikit-learn code is here:
[
https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565315#comment-15565315
]
Peng Meng commented on SPARK-17870:
---
https://github.com/apache/spark/pull/1484#issuecomment-51024568
Hi
[
https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565041#comment-15565041
]
Peng Meng commented on SPARK-17870:
---
Hi [~srowen], thanks very much for your quick reply.
Yes, the
[
https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567170#comment-15567170
]
Peng Meng commented on SPARK-17870:
---
Hi [~avulanov], the question here is not whether to use raw chi2 scores or
Peng Meng created SPARK-17870:
-
Summary: ML/MLLIB: Statistics.chiSqTest(RDD) is wrong
Key: SPARK-17870
URL: https://issues.apache.org/jira/browse/SPARK-17870
Project: Spark
Issue Type: Bug
Peng Meng created SPARK-20443:
-
Summary: The blockSize of MLLIB ALS should be settable by the user
Key: SPARK-20443
URL: https://issues.apache.org/jira/browse/SPARK-20443
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981148#comment-15981148
]
Peng Meng edited comment on SPARK-20446 at 4/24/17 4:18 PM:
Thanks [~mlnick],
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982251#comment-15982251
]
Peng Meng commented on SPARK-20446:
---
Yes, I compared with ML ALSModel.recommendAll. The data size is
Peng Meng created SPARK-21623:
-
Summary: Comment on parentStats in
ml/tree/impl/DTStatsAggregator.scala is wrong
Key: SPARK-21623
URL: https://issues.apache.org/jira/browse/SPARK-21623
Project: Spark
Peng Meng created SPARK-21624:
-
Summary: Optimize communication cost of RF/GBT/DT
Key: SPARK-21624
URL: https://issues.apache.org/jira/browse/SPARK-21624
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112374#comment-16112374
]
Peng Meng commented on SPARK-21624:
---
ping [~josephkb] [~srowen] [~yanboliang] [~mlnick] [~yuhaoyan]
>
[
https://issues.apache.org/jira/browse/SPARK-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113793#comment-16113793
]
Peng Meng commented on SPARK-21624:
---
Thanks [~mlnick], using Vector and compress is reasonable. I will
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114132#comment-16114132
]
Peng Meng commented on SPARK-21638:
---
This is because "we not add the node to mutableNodesForGroup, but
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114153#comment-16114153
]
Peng Meng commented on SPARK-21638:
---
In the example warning message, the split node should be 2621;
>
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21638:
--
Description:
When training an RF model, there are many warning messages like this:
{quote}WARN RandomForest:
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114133#comment-16114133
]
Peng Meng commented on SPARK-21638:
---
I am heading home now and will answer your question next week.
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114156#comment-16114156
]
Peng Meng commented on SPARK-21638:
---
The first data should - nodeMemUsage
> Warning message of RF is
Peng Meng created SPARK-21638:
-
Summary: Warning message of RF is not accurate
Key: SPARK-21638
URL: https://issues.apache.org/jira/browse/SPARK-21638
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21638:
--
Description:
When training an RF model, there are many warning messages like this:
{quote}WARN RandomForest:
[
https://issues.apache.org/jira/browse/SPARK-21680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120115#comment-16120115
]
Peng Meng commented on SPARK-21680:
---
Then we will have two toSparse:
toSparse
and
toSparse(size)
Do
[
https://issues.apache.org/jira/browse/SPARK-21680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121175#comment-16121175
]
Peng Meng commented on SPARK-21680:
---
I mean if the user calls toSparse(size), but the size is smaller
[
https://issues.apache.org/jira/browse/SPARK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121284#comment-16121284
]
Peng Meng commented on SPARK-21688:
---
MKL is just an example of native BLAS; if the user has OpenBLAS,
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085738#comment-16085738
]
Peng Meng commented on SPARK-21401:
---
Yes, SPARK-21389 used pq.poll. pq.poll is just a small part of
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21401:
--
Description:
Most BoundedPriorityQueue usages in ML/MLLIB are:
Get the value of
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085874#comment-16085874
]
Peng Meng commented on SPARK-21401:
---
Sure, I will add isEmpty and maybe some other functions, and tests
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089648#comment-16089648
]
Peng Meng commented on SPARK-21401:
---
Yes, you don't need to do it for the vast majority of elements.
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089656#comment-16089656
]
Peng Meng commented on SPARK-21401:
---
Got it, thanks [~srowen]
> add poll function for
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089629#comment-16089629
]
Peng Meng commented on SPARK-21401:
---
Thanks @srowen.
I mean for BoundedPriorityQueue, you also can
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089638#comment-16089638
]
Peng Meng commented on SPARK-21401:
---
I mean we totally rewrite the BoundedPriorityQueue and do not use Java
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089437#comment-16089437
]
Peng Meng commented on SPARK-21401:
---
My benchmark only switches between pq.toArray.sorted and pq.poll.
pq.poll
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089586#comment-16089586
]
Peng Meng edited comment on SPARK-21401 at 7/17/17 10:10 AM:
-
I have tested
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089424#comment-16089424
]
Peng Meng commented on SPARK-21401:
---
Hi [~srowen], for ALS optimization, the difference of using poll
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089578#comment-16089578
]
Peng Meng commented on SPARK-21401:
---
Hi [~srowen], I found out why my original test of pq.toArray.sorted is very
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089586#comment-16089586
]
Peng Meng commented on SPARK-21401:
---
I have tested poll and toArray.sorted extensively.
If the queue is
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089606#comment-16089606
]
Peng Meng commented on SPARK-21401:
---
I think the BoundedPriorityQueue should be rewritten. there are
Peng Meng created SPARK-21389:
-
Summary: ALS recommendForAll optimization uses Native BLAS
Key: SPARK-21389
URL: https://issues.apache.org/jira/browse/SPARK-21389
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21389:
--
Description:
In Spark 2.2, we have optimized ALS recommendForAll, which uses a hand-written
matrix
[
https://issues.apache.org/jira/browse/SPARK-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21389:
--
Description:
In Spark 2.2, we have optimized ALS recommendForAll, which uses a hand-written
matrix
Peng Meng created SPARK-21401:
-
Summary: add poll function for BoundedPriorityQueue
Key: SPARK-21401
URL: https://issues.apache.org/jira/browse/SPARK-21401
Project: Spark
Issue Type: Improvement
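The comments on this ticket compare two ways of reading a bounded priority queue in sorted order: copying it to an array and sorting, or polling elements one at a time. The sketch below illustrates both on a small top-k heap using Python's `heapq`. It is an analogy only and not Spark's BoundedPriorityQueue; the helper names `top_k`, `drain_by_poll`, and `drain_by_sort` are hypothetical, and no claim is made here about which approach is faster.

```python
import heapq


def top_k(values, k):
    """Keep the k largest values in a min-heap (the smallest kept value is at index 0)."""
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            # Replace the smallest kept value with the larger newcomer.
            heapq.heapreplace(heap, v)
    return heap


def drain_by_poll(heap):
    """Pop elements one by one (ascending), then reverse to get descending order."""
    heap = list(heap)  # a copy of a valid heap is still a valid heap
    out = []
    while heap:
        out.append(heapq.heappop(heap))
    out.reverse()
    return out


def drain_by_sort(heap):
    """Copy the heap contents and sort them in descending order."""
    return sorted(heap, reverse=True)


kept = top_k([5, 1, 9, 3, 7, 2], k=3)
by_poll = drain_by_poll(kept)
by_sort = drain_by_sort(kept)
```

Both helpers return the same descending list; they differ only in how the ordering is produced, which is the trade-off the comments on this ticket go on to benchmark.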
[
https://issues.apache.org/jira/browse/SPARK-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089209#comment-16089209
]
Peng Meng commented on SPARK-21401:
---
Hi [~srowen], here we also want to get a fully sorted list by get
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094266#comment-16094266
]
Peng Meng commented on SPARK-21476:
---
It seems transform should use transformImpl but does not?
>
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094603#comment-16094603
]
Peng Meng commented on SPARK-21476:
---
I am optimizing RF and GBT these days. If no one is working on it, I
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101476#comment-16101476
]
Peng Meng commented on SPARK-21476:
---
Hi [~sagraw], could you please test copy pasted the transform
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101476#comment-16101476
]
Peng Meng edited comment on SPARK-21476 at 7/26/17 10:06 AM:
-
Hi [~sagraw],
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101253#comment-16101253
]
Peng Meng commented on SPARK-21476:
---
Not every transform uses broadcast; do you have any experiment
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099593#comment-16099593
]
Peng Meng commented on SPARK-21476:
---
Hi @Suarabh, I am profiling RF transform performance. I change
[
https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099593#comment-16099593
]
Peng Meng edited comment on SPARK-21476 at 7/25/17 6:55 AM:
Hi @Suarabh, I am
[
https://issues.apache.org/jira/browse/SPARK-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094106#comment-16094106
]
Peng Meng commented on SPARK-2465:
--
I think it is time to revisit this now. Some of our customers, such
Peng Meng created SPARK-21305:
-
Summary: The BKM (best known methods) of using native BLAS to
improve ML/MLLIB performance
Key: SPARK-21305
URL: https://issues.apache.org/jira/browse/SPARK-21305
[
https://issues.apache.org/jira/browse/SPARK-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073801#comment-16073801
]
Peng Meng commented on SPARK-21305:
---
yes, I will do that.
Because different blas, the method to
[
https://issues.apache.org/jira/browse/SPARK-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073798#comment-16073798
]
Peng Meng commented on SPARK-21305:
---
ping [~mlnick] , [~yanboliang], [~mengxr], [~srowen]
> The BKM
[
https://issues.apache.org/jira/browse/SPARK-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073801#comment-16073801
]
Peng Meng edited comment on SPARK-21305 at 7/4/17 3:39 PM:
---
yes, I will do
[
https://issues.apache.org/jira/browse/SPARK-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074320#comment-16074320
]
Peng Meng commented on SPARK-21305:
---
Thanks [~srowen] and [~yanboliang]
I will disable native BLAS MT
[
https://issues.apache.org/jira/browse/SPARK-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076152#comment-16076152
]
Peng Meng commented on SPARK-21305:
---
I tested Intel MKL and OpenBLAS by ALS Train and Prediction.
ALS
[
https://issues.apache.org/jira/browse/SPARK-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076152#comment-16076152
]
Peng Meng edited comment on SPARK-21305 at 7/6/17 8:22 AM:
---
I tested Intel MKL
[
https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-20443:
--
Description:
The blockSize of MLLIB ALS is very important for ALS performance.
In our test, when the
Peng Meng created SPARK-20446:
-
Summary: Optimize the process of MLLIB ALS recommendForAll
Key: SPARK-20446
URL: https://issues.apache.org/jira/browse/SPARK-20446
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-20446:
--
Description:
The recommendForAll of MLLIB ALS is very slow.
GC is a key problem of the current method.
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981115#comment-15981115
]
Peng Meng commented on SPARK-20446:
---
I think you said: https://github.com/apache/spark/pull/9980
Maybe
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981148#comment-15981148
]
Peng Meng commented on SPARK-20446:
---
Thanks [~mlnick], I also compared DataFrame Version ALS
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983030#comment-15983030
]
Peng Meng commented on SPARK-20446:
---
Thanks [~mlnick] , I agree with you. I am ok to close this ticket
[
https://issues.apache.org/jira/browse/SPARK-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982251#comment-15982251
]
Peng Meng edited comment on SPARK-20446 at 4/25/17 3:06 PM:
Yes, I compared
[
https://issues.apache.org/jira/browse/SPARK-20443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983059#comment-15983059
]
Peng Meng commented on SPARK-20443:
---
Yes, based on my current test, I agree.
But if the data size is
[
https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983074#comment-15983074
]
Peng Meng commented on SPARK-11968:
---
Thanks [~mlnick] , I will post more results here.
I latest result
[
https://issues.apache.org/jira/browse/SPARK-21680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120136#comment-16120136
]
Peng Meng commented on SPARK-21680:
---
Ok, thanks, I will submit a PR.
> ML/MLLIB Vector compressed
[
https://issues.apache.org/jira/browse/SPARK-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120105#comment-16120105
]
Peng Meng commented on SPARK-21624:
---
Hi [~mlnick], what do you think about this:
Peng Meng created SPARK-21680:
-
Summary: ML/MLLIB Vector compressed optimization
Key: SPARK-21680
URL: https://issues.apache.org/jira/browse/SPARK-21680
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-21680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120947#comment-16120947
]
Peng Meng commented on SPARK-21680:
---
Hi [~srowen], if we add toSparse(size), for safety reasons, it is
[
https://issues.apache.org/jira/browse/SPARK-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21624:
--
Description:
{quote}The implementation of RF is bound by either the cost of statistics
computation
[
https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peng Meng updated SPARK-21638:
--
Description:
When training an RF model, there are many warning messages like this:
{quote}WARN RandomForest:
[
https://issues.apache.org/jira/browse/SPARK-20764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020662#comment-16020662
]
Peng Meng commented on SPARK-20764:
---
I will submit a PR to cover more tests for model summary, thanks.