[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688225#comment-16688225 ]

Jose Saray commented on SPARK-23266:

Hi Chandan, we had an email exchange in June, and I think you did not answer this question. Maybe you could answer it here, so that we can all be aware of the scope of your algorithm.

June 4th, 2018: "...I just saw in your documentation that, to be invertible by your algorithm, the matrix must be positive definite. I am not very knowledgeable about linear algebra, but does this mean that the Strassen algorithm, or your algorithm, cannot invert every type of invertible matrix? Are the inputs of your algorithm constrained by this positive-definite requirement...?"

> Matrix Inversion on BlockMatrix
> ---
>
> Key: SPARK-23266
> URL: https://issues.apache.org/jira/browse/SPARK-23266
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 2.2.1
> Reporter: Chandan Misra
> Priority: Minor
>
> Matrix inversion is the basic building block for many other algorithms like
> regression, classification, geostatistical analysis using ordinary kriging
> etc. A simple Spark BlockMatrix based efficient distributed
> divide-and-conquer algorithm can be implemented using only *6*
> multiplications in each recursion level of the algorithm. The reference paper
> can be found in
> [https://arxiv.org/abs/1801.04723]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
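As background to the positive-definite question, here is a minimal sketch of the classic Strassen-style block inversion recursion, which uses six block multiplications and two recursive inversions per level. It is an assumption that this matches the scheme in the referenced paper. The function names (`block_inverse`, `matmul`, and so on) are hypothetical, and plain Python lists stand in for Spark's BlockMatrix. The recursion inverts the top-left block and the Schur complement at every level, which is always possible for symmetric positive-definite input but can fail for other invertible matrices.

```python
def matmul(a, b):
    """Plain dense product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sub(a, b):
    """Element-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    """Element-wise negation."""
    return [[-x for x in row] for row in a]


def split(a):
    """Split a square matrix into four blocks A11, A12, A21, A22."""
    h = len(a) // 2
    return ([row[:h] for row in a[:h]], [row[h:] for row in a[:h]],
            [row[:h] for row in a[h:]], [row[h:] for row in a[h:]])


def join(c11, c12, c21, c22):
    """Reassemble four blocks into one matrix."""
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom


def block_inverse(a):
    """Strassen-style recursive inversion (hypothetical helper, not Spark API)."""
    if len(a) == 1:
        if a[0][0] == 0:
            raise ZeroDivisionError("singular diagonal block")
        return [[1.0 / a[0][0]]]
    a11, a12, a21, a22 = split(a)
    one = block_inverse(a11)        # A11^-1 (recursive inversion 1)
    two = matmul(a21, one)          # multiplication 1
    three = matmul(one, a12)        # multiplication 2
    four = matmul(a21, three)       # multiplication 3
    five = sub(four, a22)           # negated Schur complement
    six = block_inverse(five)       # recursive inversion 2
    c12 = matmul(three, six)        # multiplication 4
    c21 = matmul(six, two)          # multiplication 5
    seven = matmul(three, c21)      # multiplication 6
    c11 = sub(one, seven)
    c22 = neg(six)
    return join(c11, c12, c21, c22)


# Hypothetical symmetric positive-definite input (strictly diagonally dominant).
spd = [[4.0, 1.0, 0.0, 0.0],
       [1.0, 4.0, 1.0, 0.0],
       [0.0, 1.0, 4.0, 1.0],
       [0.0, 0.0, 1.0, 4.0]]
spd_inverse = block_inverse(spd)
```

For contrast, `[[0.0, 1.0], [1.0, 0.0]]` is invertible, yet this sketch raises `ZeroDivisionError` on it because its top-left block is zero. That is one concrete way an unpivoted block recursion can be limited to a subclass of invertible matrices.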
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496306#comment-16496306 ]

Chandan Misra commented on SPARK-23266:

I would like to add this feature in one of the upcoming versions. Kindly let me know how this can be done.
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482159#comment-16482159 ]

renfangming commented on SPARK-23266:

[~chandan-misra] Hello, is this feature added in Spark 2.3.0?
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389061#comment-16389061 ]

Chandan Misra commented on SPARK-23266:

I have implemented matrix inversion using Spark version 2.2.0, though the implementation can be executed on Spark version 2.0.0 onwards. It would be really helpful if the inversion were added in the next Spark version. As already mentioned, I have the implementation of the inversion and am happy to contribute.
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381604#comment-16381604 ]

Apache Spark commented on SPARK-23266:

Hello, in which version will this issue be released?
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348310#comment-16348310 ]

Chandan Misra commented on SPARK-23266:

*How big is n typically for your use case?*

To give a sense of how large the data used in Kriging can be, the following paper might interest you: [http://www.tandfonline.com/doi/full/10.1080/2150704X.2016.1275053]

The number of points there is 650 million and the size is 18 GB. I think inverting the variance-covariance matrix C is impossible if it has to be processed locally.

*I'm also not clear how common this operation is?*

Kriging is used extensively in many fields, such as earth science, mining, weather prediction, wireless sensor networks, and remote sensing applications like filling gaps in satellite raster images or creating Digital Elevation Models from LiDAR data, to name a few, and it is backed by a large number of research papers. There are separate R packages implemented solely for Kriging, such as gstat and geoR, but these are limited to a single node and fail when a large dataset is fed to the system.

Additionally, there is ongoing research (like [this|https://www.spiedigitallibrary.org/journals/Journal-of-Applied-Remote-Sensing/volume-11/issue-1/016011/High-performance-parallel-approaches-for-three-dimensional-light-detection-and/10.1117/1.JRS.11.016011.short?SSO=1]) on parallelizing Kriging with MPI, Hadoop, and GPUs. One of the teams is [GIST at Oak Ridge national laboratory|http://web.ornl.gov/sci/gist/res_high_performance.shtml], which performs geo-computation in an HPC setup. I think Spark, given its benefits, can easily replace the others in this regard.

Thus, as a core processing component of Kriging, matrix inversion is highly relevant, and a Spark implementation will provide a hassle-free solution for a large fraction of researchers outside computer science.
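As a rough aid for the "how big is n" question, the following back-of-the-envelope sketch estimates the memory a dense n x n matrix C needs, assuming 8-byte double-precision entries and no sparsity or compression. The function name `dense_matrix_bytes` is hypothetical, and the thread does not state what n would be for any particular dataset.

```python
def dense_matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed to hold a dense n x n matrix (8 bytes = one double)."""
    return n * n * bytes_per_entry


# Illustrative sizes only; these values of n are assumptions.
for n in (10_000, 50_000, 100_000):
    gigabytes = dense_matrix_bytes(n) / 1e9
    print(f"n = {n:>7,}: {gigabytes:,.1f} GB")
```

Under these assumptions, n = 10,000 needs 0.8 GB and n = 50,000 already needs 20 GB for C alone, before counting any working copies made during inversion.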
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346976#comment-16346976 ]

Sean Owen commented on SPARK-23266:

If you're solving a linear system, you don't need or want to invert a matrix; you would solve it with a decomposition instead. I get that you're trying to establish the inverse projection for a fixed C, though. How big is n typically for your use case? For n in the tens of thousands, it's probably more efficient to work locally. I'm also not clear how common this operation is, but it's plausible.
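To illustrate the point about decompositions, here is a minimal sketch, assuming C is symmetric positive definite: factor C = L L^T once with a Cholesky decomposition, then solve Cw = D by two triangular substitutions, without ever forming the inverse. The function names and the 3x3 values are hypothetical, and pure Python stands in for a real linear-algebra library.

```python
import math


def cholesky(a):
    """Return lower-triangular L with a = L * L^T (a must be SPD)."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diag = a[i][i] - partial
                if diag <= 0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(diag)
            else:
                lower[i][j] = (a[i][j] - partial) / lower[j][j]
    return lower


def solve_with_factor(lower, rhs):
    """Solve (L * L^T) x = rhs by forward then backward substitution."""
    n = len(lower)
    y = [0.0] * n
    for i in range(n):  # forward: L y = rhs
        y[i] = (rhs[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward: L^T x = y
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


# Hypothetical 3x3 covariance-like matrix and right-hand side.
C = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.5],
     [0.2, 0.5, 1.0]]
D = [0.9, 0.6, 0.3]

L_factor = cholesky(C)                # computed once for a fixed C
w = solve_with_factor(L_factor, D)    # repeated for each new D
```

The factor plays the same role as a stored inverse for a fixed C: it is computed once, and each additional right-hand side costs only the two O(n^2) substitutions.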
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346398#comment-16346398 ]

Chandan Misra commented on SPARK-23266:

Kriging is a geostatistical method that interpolates the known attribute values of scattered data points to predict unknown attribute values at other points. It is based on the kriging weight calculation, which has an equation of the form Cw=D. To get the weight vector w, we need one matrix inversion and one matrix-vector multiplication. The sizes of C (the covariance matrix), the w vector and the D vector are nxn, nx1 and nx1 respectively, where n is the number of interpolating points (input points) and the interpolation is done for a single output point. When the output point changes, we only need to change D (a vector for a single point), and thus do not have to compute the inverse again and again. We then only require matrix-vector multiplications on several nodes, which are quick, i.e. O(n^2), and easy.
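The reuse described above can be sketched as follows: invert C once, then obtain the weights for each output point with a single matrix-vector product w = C^-1 D. This is a minimal single-machine illustration, not the distributed algorithm. The helper names (`invert`, `matvec`) and all numeric values are hypothetical, and Gauss-Jordan elimination is swapped in here purely as a simple way to get an inverse.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(matrix, vector):
    """Matrix-vector product, O(n^2) for an n x n matrix."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Hypothetical covariance matrix for n = 3 input points.
C = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.5],
     [0.2, 0.5, 1.0]]

# One hypothetical D vector per output point.
D_per_output_point = [[0.9, 0.6, 0.3],
                      [0.3, 0.6, 0.9]]

C_inv = invert(C)                                           # O(n^3), done once
weights = [matvec(C_inv, d) for d in D_per_output_point]    # O(n^2) per point
```

Because the products for different output points are independent of each other, they can be spread across nodes, which is the cheap step the comment refers to.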
[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345083#comment-16345083 ]

Chandan Misra commented on SPARK-23266:

I am one of the authors of the above-mentioned paper. I would like to contribute and help in this regard.

> Matrix Inversion on BlockMatrix
> ---
>
> Key: SPARK-23266
> URL: https://issues.apache.org/jira/browse/SPARK-23266
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 2.2.1
> Reporter: Chandan Misra
> Priority: Critical
>
> Matrix inversion is the basic building block for many other algorithms like
> regression, classification, geostatistical analysis using ordinary kriging
> etc. A simple Spark BlockMatrix based efficient distributed
> divide-and-conquer algorithm can be implemented using only *6*
> multiplications in each recursion level of the algorithm. The reference paper
> can be found in
> [https://arxiv.org/abs/1801.04723]