[ https://issues.apache.org/jira/browse/MAHOUT-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973508#comment-14973508 ]

Dmitriy Lyubimov commented on MAHOUT-1781:
------------------------------------------

{code:title=test}
  test("dot-view performance") {

    val dv1 = new DenseVector(500) := Matrices.uniformView(1, 500, 1234)(0, ::)
    val dv2 = new DenseVector(500) := Matrices.uniformView(1, 500, 1244)(0, ::)

    val nit = 300000

    // warm up
    dv1 dot dv2

    val dmsStart = System.currentTimeMillis()
    for (i ← 0 until nit)
      dv1 dot dv2
    val dmsMs = System.currentTimeMillis() - dmsStart

    val (dvv1, dvv2) = dv1(0 until dv1.length) → dv2(0 until dv2.length)

    // Warm up.
    dvv1 dot dvv2

    val dvmsStart = System.currentTimeMillis()
    for (i ← 0 until nit)
      dvv1 dot dvv2
    val dvmsMs = System.currentTimeMillis() - dvmsStart

    debug(f"dense vector dots:${dmsMs}%.2f ms.")
    debug(f"dense view dots:${dvmsMs}%.2f ms.")

  }

{code}

{panel:title=output}
Testing started at 4:49 PM ...
0 [ScalaTest-run-running-RLikeVectorOpsSuite] DEBUG 
org.apache.mahout.math.scalabindings.RLikeVectorOpsSuite  - dense vector 
dots:43.00 ms.
1 [ScalaTest-run-running-RLikeVectorOpsSuite] DEBUG 
org.apache.mahout.math.scalabindings.RLikeVectorOpsSuite  - dense view 
dots:301.00 ms.
{panel}

So far, all investigation shows that even though the algorithm selected by 
VectorBinaryAggregate is logically identical to what the dense vector does in 
this situation, the entire difference comes from the cost of the cost estimate 
itself (about 20% of the excess) and the cost of iteration (about 80% of the 
excess).

The iteration cost of the cost-estimated algorithm is mostly due to additional 
levels of indirection in function calls for every element processed, whereas 
the dense vector accesses its backing array elements directly.
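To make the indirection cost concrete, here is a minimal, self-contained model (not Mahout's actual code; the aggregator/combiner pair is only an analogy to the per-element call shape VectorBinaryAggregate imposes) contrasting a direct array-backed dot with one that routes every element through generic function objects:

```scala
object DotIndirectionSketch {

  // Direct access: a tight loop over the backing arrays, as DenseVector does.
  def dotDirect(x: Array[Double], y: Array[Double]): Double = {
    var s = 0.0
    var i = 0
    while (i < x.length) { s += x(i) * y(i); i += 1 }
    s
  }

  // Indirect: every element goes through generic (Double, Double) => Double
  // function objects, roughly the per-element indirection a cost-estimated
  // aggregate framework adds on top of the same logical loop.
  def dotIndirect(x: Array[Double], y: Array[Double],
                  combine: (Double, Double) => Double,
                  aggregate: (Double, Double) => Double): Double = {
    var s = 0.0
    var i = 0
    while (i < x.length) { s = aggregate(s, combine(x(i), y(i))); i += 1 }
    s
  }

  def main(args: Array[String]): Unit = {
    val n = 500
    val a = Array.tabulate(n)(_.toDouble)
    val b = Array.tabulate(n)(i => (n - i).toDouble)
    // Both paths compute the same value; only the call structure differs.
    assert(dotDirect(a, b) == dotIndirect(a, b, _ * _, _ + _))
  }
}
```

Both loops are logically identical, which matches the observation above: the slowdown is not the algorithm but the call overhead per element (and the JIT's weaker ability to inline megamorphic call sites).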

Rewriting the simplest possible kernel for dense-view dot dense-view would 
improve the situation, but I don't see how to fix this elegantly. A special 
DenseVectorView implementation just for the dot??
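One possible shape for such a kernel, sketched with hypothetical stand-in types (DenseView below is illustrative, not Mahout's actual view class): unwrap both operands to backing array, offset, and length, and run the tight loop directly, leaving all other operand combinations to the existing cost-estimated path.

```scala
// Hypothetical stand-in for a view over a dense backing array.
final class DenseView(val backing: Array[Double], val offset: Int, val length: Int)

object DenseViewDot {

  // Special-cased dot kernel: no per-element function-object indirection,
  // just direct indexed access through (array, offset).
  def dot(x: DenseView, y: DenseView): Double = {
    require(x.length == y.length, "vectors must have the same length")
    var s = 0.0
    var i = 0
    while (i < x.length) {
      s += x.backing(x.offset + i) * y.backing(y.offset + i)
      i += 1
    }
    s
  }
}
```

A real fix would presumably dispatch to a kernel like this from the dot implementation when both operands are recognized as dense views, rather than introducing a new vector class.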





> Dense matrix view multiplication is 4x slower than non-view one
> ---------------------------------------------------------------
>
>                 Key: MAHOUT-1781
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1781
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.11.0
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>            Priority: Critical
>             Fix For: 0.12.0, 0.13.0
>
>
> if mxA and mxB are two in-core DenseMatrix matrices, then 
> mxA(::,::) %*% mxB(::,::) takes 4x the time of mxA %*% mxB.
> possibly an issue of dot products on VectorViews vs. DenseVectors.
> dot product over DenseVectors seems not to go through the aggregate() 
> cost-optimized framework. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
