[jira] [Created] (SPARK-8341) Significant selector feature transformation

2015-06-13 Thread Kirill A. Korinskiy (JIRA)
Kirill A. Korinskiy created SPARK-8341:
--

 Summary: Significant selector feature transformation
 Key: SPARK-8341
 URL: https://issues.apache.org/jira/browse/SPARK-8341
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Kirill A. Korinskiy
Priority: Minor


The idea of this transformation is to safely reduce large vectors, such as 
those produced by HashingTF, in order to lower the memory needed to 
manipulate them.

The transformation creates a model that keeps only the indices whose values 
differ across the data seen at the fit stage.
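
A minimal sketch of the idea above, assuming plain Scala arrays instead of 
MLlib vectors so that it runs without Spark. The names 
SignificantSelectorSketch, Model, fit and transform are hypothetical and are 
not taken from the proposed patch.

```scala
// Hypothetical sketch, not the API proposed in SPARK-8341.
object SignificantSelectorSketch {
  /** Indices retained by a fitted selector, in ascending order. */
  final case class Model(indices: Array[Int]) {
    /** Keep only the retained indices of a vector. */
    def transform(v: Array[Double]): Array[Double] = indices.map(i => v(i))
  }

  /** Keep every index that takes at least two distinct values across `data`. */
  def fit(data: Seq[Array[Double]]): Model = {
    require(data.nonEmpty, "fit needs at least one vector")
    val size = data.head.length
    require(data.forall(_.length == size), "all vectors must have the same size")
    val first = data.head
    // An index is constant when every vector agrees with the first one there.
    val kept = (0 until size).filter(i => data.exists(v => v(i) != first(i)))
    Model(kept.toArray)
  }
}
```

Indices that never change during fit carry no information for the fitted 
data, which is why dropping them is safe for that data set.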



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-04-10 Thread Kirill A. Korinskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14490714#comment-14490714
 ] 

Kirill A. Korinskiy commented on SPARK-6244:


Sean, sorry for the slow response.

The idea behind these methods is to construct a complicated vector.

In my data model I have a few objects with a complicated structure (enums, 
numbers, and other types), and I create a vector that describes the 
relationship between these objects.

For example, imagine two objects: a candidate for a position, and the 
position itself. The position has a Location and a Within; the candidate 
also has a Location, plus Willing to Relocate.

So I now use this method to describe the relationship as:
{code}
new VectorSpace()
  .add(new VectorSpace().add(position.location).add(position.with_in).sum)
  .scaled(-1d)
  .add(new VectorSpace().add(candidate.location).add(candidate.willing_to_relocate).sum)
  .sum
{code}

I have many similar partial vectors, which I combine into a single vector 
with concat.

 Implement VectorSpace to easy create a complicated feature vector
 -

 Key: SPARK-6244
 URL: https://issues.apache.org/jira/browse/SPARK-6244
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Kirill A. Korinskiy
Priority: Minor

 VectorSpace is a wrapper that implements three operations:
  - concat -- concatenates all vectors into a single vector
  - sum -- sums the vectors
  - scaled -- multiplies the vectors by a scalar
  
 Example of usage:
 ```
 import org.apache.spark.mllib.linalg.Vectors
 import org.apache.spark.mllib.linalg.VectorSpace
 // Create a new vector space holding one dense vector.
 val vs = VectorSpace.create(Vectors.dense(1.0, 0.0, 3.0))
 // Add a scaled copy of the vector space to itself.
 val vs2 = vs.add(vs.scaled(-1d))
 // Concatenate the vectors in the vector space;
 // result: Vectors.dense(1.0, 0.0, 3.0, -1.0, 0.0, -3.0)
 val concat = vs2.concat
 // Sum the vectors in the vector space; result: Vectors.dense(0.0, 0.0, 0.0)
 val sum = vs2.sum
 ```
 This wrapper is very useful when creating a complicated feature vector from 
 structured objects.






[jira] [Created] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
Kirill A. Korinskiy created SPARK-6244:
--

 Summary: Implement VectorSpace to easy create a complicated 
feature vector
 Key: SPARK-6244
 URL: https://issues.apache.org/jira/browse/SPARK-6244
 Project: Spark
  Issue Type: New Feature
Reporter: Kirill A. Korinskiy


VectorSpace is a wrapper that implements three operations:
 - concat -- concatenates all vectors into a single vector
 - sum -- sums the vectors
 - scaled -- multiplies the vectors by a scalar
 
Example of usage:
```
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.VectorSpace

// Create a new vector space holding one dense vector.
val vs = VectorSpace.create(Vectors.dense(1.0, 0.0, 3.0))

// Add a scaled copy of the vector space to itself.
val vs2 = vs.add(vs.scaled(-1d))

// Concatenate the vectors in the vector space;
// result: Vectors.dense(1.0, 0.0, 3.0, -1.0, 0.0, -3.0)
val concat = vs2.concat

// Sum the vectors in the vector space; result: Vectors.dense(0.0, 0.0, 0.0)
val sum = vs2.sum
```

This wrapper is very useful when creating a complicated feature vector from 
structured objects.
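
For illustration only, the three operations could be sketched over plain 
Scala arrays as follows. This is an assumption about how such a wrapper 
might look, not the code from the pull request; the name VectorSpaceSketch 
is hypothetical, and the real proposal works on 
org.apache.spark.mllib.linalg.Vector.

```scala
// Hypothetical sketch over Array[Double]; runs without Spark.
final case class VectorSpaceSketch(vectors: Vector[Array[Double]]) {
  /** Append one vector to the list. */
  def add(v: Array[Double]): VectorSpaceSketch = VectorSpaceSketch(vectors :+ v)

  /** Append every vector held by another wrapper. */
  def add(other: VectorSpaceSketch): VectorSpaceSketch =
    VectorSpaceSketch(vectors ++ other.vectors)

  /** Multiply every held vector by a scalar. */
  def scaled(factor: Double): VectorSpaceSketch =
    VectorSpaceSketch(vectors.map(_.map(_ * factor)))

  /** Join all held vectors end to end. */
  def concat: Array[Double] = vectors.flatMap(_.toSeq).toArray

  /** Element-wise sum; all held vectors must have the same size. */
  def sum: Array[Double] = {
    require(vectors.nonEmpty, "nothing to sum")
    val size = vectors.head.length
    require(vectors.forall(_.length == size), "all vectors must have the same size")
    vectors.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
  }
}

object VectorSpaceSketch {
  def create(v: Array[Double]): VectorSpaceSketch = VectorSpaceSketch(Vector(v))
}
```

Every operation returns a new wrapper, so intermediate results such as vs 
and vs2 in the example above can be reused freely.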






[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14354819#comment-14354819
 ] 

Kirill A. Korinskiy commented on SPARK-6244:


Yes, I agree that the name Vector Space may not be correct for this wrapper 
and that "list of vectors" sounds better.

I checked breeze and found vertcat, but that operation only supports vectors 
of the same type.

In my case, I create a feature vector from both sparse and dense vectors.
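
To make the mixed case concrete, here is a small sketch of concatenating 
sparse and dense parts into one sparse vector. The types Vec, Dense and 
Sparse are stand-ins defined only for this example; they are neither the 
MLlib nor the breeze classes.

```scala
// Hypothetical stand-in types; runs without Spark or breeze.
object MixedConcatSketch {
  sealed trait Vec { def size: Int }
  final case class Dense(values: Array[Double]) extends Vec {
    def size: Int = values.length
  }
  final case class Sparse(size: Int, indices: Array[Int], values: Array[Double])
    extends Vec

  /** Concatenate parts of either kind into a single sparse vector. */
  def concat(parts: Seq[Vec]): Sparse = {
    val indices = Array.newBuilder[Int]
    val values = Array.newBuilder[Double]
    var offset = 0 // position where the current part starts in the result
    parts.foreach { part =>
      part match {
        case Dense(vs) =>
          // Store only the non-zero entries of a dense part.
          vs.zipWithIndex.foreach { case (v, i) =>
            if (v != 0.0) { indices += offset + i; values += v }
          }
        case Sparse(_, is, vs) =>
          // Shift the existing indices by the running offset.
          is.zip(vs).foreach { case (i, v) => indices += offset + i; values += v }
      }
      offset += part.size
    }
    Sparse(offset, indices.result(), values.result())
  }
}
```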







[jira] [Issue Comment Deleted] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill A. Korinskiy updated SPARK-6244:
---
Comment: was deleted

(was: Yes, I agree that the name Vector Space may not be correct for this 
wrapper and that "list of vectors" sounds better.

I checked breeze and found vertcat, but that operation only supports vectors 
of the same type.

In my case, I create a feature vector from both sparse and dense vectors.)







[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14354820#comment-14354820
 ] 

Kirill A. Korinskiy commented on SPARK-6244:


Yes, I agree that the name Vector Space may not be correct for this wrapper 
and that "list of vectors" sounds better.

I checked breeze and found vertcat, but that operation only supports vectors 
of the same type.

In my case, I create a feature vector from both sparse and dense vectors.







[jira] [Issue Comment Deleted] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill A. Korinskiy updated SPARK-6244:
---
Comment: was deleted

(was: Yes, this way sounds good.

Can I use the same issue and pull request, or must I create a new one?)







[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356164#comment-14356164
 ] 

Kirill A. Korinskiy commented on SPARK-6244:


Yes, this way sounds good.

Can I use the same issue and pull request, or must I create a new one?







[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356165#comment-14356165
 ] 

Kirill A. Korinskiy commented on SPARK-6244:


Yes, this way sounds good.

Can I use the same issue and pull request, or must I create a new one?







[jira] [Updated] (SPARK-5673) Implement Streaming wrapper for all linear methos

2015-02-08 Thread Kirill A. Korinskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill A. Korinskiy updated SPARK-5673:
---
Component/s: MLlib

 Implement Streaming wrapper for all linear methos
 -

 Key: SPARK-5673
 URL: https://issues.apache.org/jira/browse/SPARK-5673
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Kirill A. Korinskiy

 Spark currently has streaming wrappers for logistic and linear regression only.
 Implementing wrappers for SVM, Lasso and Ridge Regression would make the 
 streaming mode more useful.






[jira] [Created] (SPARK-5673) Implement Streaming wrapper for all linear methos

2015-02-07 Thread Kirill A. Korinskiy (JIRA)
Kirill A. Korinskiy created SPARK-5673:
--

 Summary: Implement Streaming wrapper for all linear methos
 Key: SPARK-5673
 URL: https://issues.apache.org/jira/browse/SPARK-5673
 Project: Spark
  Issue Type: New Feature
Reporter: Kirill A. Korinskiy


Spark currently has streaming wrappers for logistic and linear regression only.

Implementing wrappers for SVM, Lasso and Ridge Regression would make the 
streaming mode more useful.
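
As a rough sketch of what such a wrapper does, the code below feeds 
mini-batches to a batch trainer that is warm-started from the current 
weights. It is plain Scala rather than Spark Streaming, and the names 
StreamingModel, trainOn and svmStep, the -1/+1 label convention, and the 
single sub-gradient step per batch are all assumptions made for this 
example.

```scala
// Hypothetical sketch; runs without Spark.
object StreamingWrapperSketch {
  type Weights = Array[Double]
  type Example = (Double, Array[Double]) // (label, features)

  /** Wraps a batch trainer so it can be fed one mini-batch at a time. */
  final class StreamingModel(initial: Weights,
                             train: (Seq[Example], Weights) => Weights) {
    private var weights: Weights = initial
    def latest: Weights = weights
    /** Update the model from a mini-batch, starting from the current weights. */
    def trainOn(batch: Seq[Example]): Unit =
      if (batch.nonEmpty) weights = train(batch, weights)
  }

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  /** One hinge-loss sub-gradient step, as for a linear SVM; labels are -1 or +1. */
  def svmStep(stepSize: Double): (Seq[Example], Weights) => Weights = (batch, w) => {
    val grad = new Array[Double](w.length)
    batch.foreach { case (label, x) =>
      // Only examples inside the margin contribute to the sub-gradient.
      if (label * dot(w, x) < 1.0) x.indices.foreach(i => grad(i) -= label * x(i))
    }
    Array.tabulate(w.length)(i => w(i) - stepSize * grad(i) / batch.size)
  }
}
```

Because the wrapper only needs a function from (batch, weights) to weights, 
the same class could host Lasso or ridge updates by swapping the trainer.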






[jira] [Created] (SPARK-5672) Don't return `ERROR 500` when have missing args

2015-02-07 Thread Kirill A. Korinskiy (JIRA)
Kirill A. Korinskiy created SPARK-5672:
--

 Summary: Don't return `ERROR 500` when have missing args
 Key: SPARK-5672
 URL: https://issues.apache.org/jira/browse/SPARK-5672
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Reporter: Kirill A. Korinskiy


The Spark web UI returns HTTP ERROR 500 when a required GET argument is missing.
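
One possible shape for the fix is to validate the argument up front and 
answer with a client error. The sketch below is an assumption, not the Spark 
web UI code: Response, required and stagePage are hypothetical names, "id" 
is a made-up parameter, and the choice of status 400 is mine rather than 
something stated in the issue.

```scala
// Hypothetical sketch; does not use the Spark web UI classes.
object MissingArgSketch {
  final case class Response(status: Int, body: String)

  /** Look up a required query parameter, or describe what is missing. */
  def required(params: Map[String, String], name: String): Either[String, String] =
    params.get(name).filter(_.nonEmpty).toRight(s"Missing required parameter: $name")

  /** Example page handler: answers 400 instead of failing with 500. */
  def stagePage(params: Map[String, String]): Response =
    required(params, "id") match {
      case Right(id)     => Response(200, s"Details for stage $id")
      case Left(message) => Response(400, message)
    }
}
```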






[jira] [Updated] (SPARK-5673) Implement Streaming wrapper for all linear methos

2015-02-07 Thread Kirill A. Korinskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill A. Korinskiy updated SPARK-5673:
---
Description: 
Spark currently has streaming wrappers for logistic and linear regression only.

Implementing wrappers for SVM, Lasso and Ridge Regression would make the 
streaming mode more useful.

  was:
Now spark had only streaming wrapper for Logistic and Linear regressions only.

So, implement wrapper for SVM, Lasso and Ridge Regression will make streaming 
fashion more useful.








[jira] [Created] (SPARK-5521) PCA wrapper for easy transform vectors

2015-02-01 Thread Kirill A. Korinskiy (JIRA)
Kirill A. Korinskiy created SPARK-5521:
--

 Summary: PCA wrapper for easy transform vectors
 Key: SPARK-5521
 URL: https://issues.apache.org/jira/browse/SPARK-5521
 Project: Spark
  Issue Type: New Feature
  Components: MLlib
Reporter: Kirill A. Korinskiy


Implement a simple PCA wrapper that makes it easy to transform vectors with 
PCA, for example those inside a LabeledPoint or another complex structure.

Currently, every PCA transformation accepts only a matrix, and there is no 
way to project individual vectors.
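
A minimal sketch of per-vector projection, assuming the principal 
components have already been computed elsewhere and are passed in as k 
direction vectors. Projector and Labeled are hypothetical names used only 
here, and the sketch does not center the input before projecting.

```scala
// Hypothetical sketch over plain arrays; runs without Spark.
object PcaProjectionSketch {
  /** Stand-in for a labeled example such as LabeledPoint. */
  final case class Labeled(label: Double, features: Array[Double])

  /** `components` holds k principal directions, each of the input dimension. */
  final class Projector(components: Array[Array[Double]]) {
    /** Project one vector: output j is the dot product with component j. */
    def transform(v: Array[Double]): Array[Double] =
      components.map { c =>
        require(c.length == v.length, "dimension mismatch")
        c.zip(v).map { case (a, b) => a * b }.sum
      }

    /** Project the features of a labeled example and keep its label. */
    def transform(p: Labeled): Labeled = p.copy(features = transform(p.features))
  }
}
```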


