[
https://issues.apache.org/jira/browse/MAHOUT-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588271#action_12588271
]
Karl Wettin commented on MAHOUT-34:
-----------------------------------
bq. I'm also thinking we should perhaps re-use the Element instance in
iterator.next?
bq. Yes, this is especially true when we go through the code of DenseVector.
But the case I came up with was this:
{code}
Element sum = null;
boolean first = true;
for (Element e : vec) {
  if (first) {
    sum = e;  // if the iterator reuses its Element, sum aliases every later e
    first = false;
  } else {
    sum.set(e.get() + sum.get());
  }
}
{code}
bq. In the 'else' part above, if we reuse elements, both sum and e actually
refer to the same object, and if the vector size is more than 2, sum will
always just hold 2*lastelement rather than the intended value. These kinds of
errors might be harder to detect in more complicated cases.
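To make the aliasing concrete, here is a minimal, self-contained sketch. The Element class and the reusing() helper are hypothetical stand-ins written for this example, not the classes from the attached patch; they only assume an iterator that returns the same Element instance from every next() call.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

final class ReusedElementDemo {
    /** Hypothetical mutable view of one slot of a dense array. */
    static final class Element {
        private final double[] data;
        private int index;
        Element(double[] data) { this.data = data; }
        int index() { return index; }
        double get() { return data[index]; }
        void set(double value) { data[index] = value; }
    }

    /** Iterable whose iterator hands out one shared Element, repositioned on each next(). */
    static Iterable<Element> reusing(final double[] data) {
        return () -> new Iterator<Element>() {
            private final Element shared = new Element(data);
            private int next = 0;
            @Override public boolean hasNext() { return next < data.length; }
            @Override public Element next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.index = next++;
                return shared;
            }
        };
    }

    /** The problematic pattern: sum and e are the same object after the first step. */
    static double[] aliasedSum(double[] values) {
        double[] data = values.clone();
        Element sum = null;
        for (Element e : reusing(data)) {
            if (sum == null) {
                sum = e;
            } else {
                sum.set(e.get() + sum.get()); // doubles the current slot in place
            }
        }
        return data;
    }

    /** Accumulating into a primitive is unaffected by Element reuse. */
    static double[] primitiveSum(double[] values) {
        double[] data = values.clone();
        int firstIndex = -1;
        double sum = 0.0;
        for (Element e : reusing(data)) {
            if (firstIndex == -1) firstIndex = e.index();
            sum += e.get();
        }
        if (firstIndex != -1) data[firstIndex] = sum;
        return data;
    }

    public static void main(String[] args) {
        double[] v = {1, 2, 3};
        System.out.println(Arrays.toString(aliasedSum(v)));   // [1.0, 4.0, 6.0]
        System.out.println(Arrays.toString(primitiveSum(v))); // [6.0, 2.0, 3.0]
    }
}
```

With the shared instance, the first slot never receives the total; each later slot is doubled instead, so the "sum" ends up as twice the last element.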
If this turns out to be a big performance issue, we just need to tell
people that this is the way it is and that code should instead be implemented
like this:
{code}
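int firstIndex = -1;
float sum = 0f;
for (Element e : vec) {
  if (firstIndex == -1) {
    firstIndex = e.index();  // remember where the total should be stored
  }
  sum += e.get();  // accumulate in a primitive, never in an Element
}
if (firstIndex != -1) {  // skip the write when vec had no elements
  vec.set(firstIndex, sum);
}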
{code}
If people have a hard time with that, we could implement the iterator as an ad
hoc class, similar to the TermEnum and TermDocs classes of Lucene, to really
point out what's going on. But that would cripple the nice and compact Iterable
code.
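As a rough illustration of the ad hoc class idea, here is a sketch of a cursor that exposes next() plus accessors, loosely in the style of Lucene's TermEnum and TermDocs. VectorCursor and its methods are hypothetical names invented for this example and are not part of the patch; a plain double[] stands in for the vector.

```java
final class VectorCursorDemo {
    /** Hypothetical cursor: the caller advances it explicitly and reads the current slot. */
    static final class VectorCursor {
        private final double[] data;
        private int pos = -1; // before the first element until next() is called

        VectorCursor(double[] data) { this.data = data; }

        /** Moves to the next slot; returns false once the end has been reached. */
        boolean next() {
            if (pos + 1 >= data.length) return false;
            pos++;
            return true;
        }

        int index() { checkPositioned(); return pos; }
        double get() { checkPositioned(); return data[pos]; }
        void set(double value) { checkPositioned(); data[pos] = value; }

        private void checkPositioned() {
            if (pos < 0) throw new IllegalStateException("call next() first");
        }
    }

    /** Sums all slots; there is no per-element object that could be retained by mistake. */
    static double sum(double[] data) {
        VectorCursor cursor = new VectorCursor(data);
        double total = 0.0;
        while (cursor.next()) {
            total += cursor.get();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new double[]{1.5, 2.5, 3.0})); // 7.0
    }
}
```

Since the cursor itself is the only object in play, it is obvious that its state changes on every next(); the cost is losing the for-each syntax that Iterable provides.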
> Iterator interface for Vectors
> ------------------------------
>
> Key: MAHOUT-34
> URL: https://issues.apache.org/jira/browse/MAHOUT-34
> Project: Mahout
> Issue Type: New Feature
> Reporter: Samee Zahur
> Assignee: Karl Wettin
> Attachments: VectorIterator.3.patch.bz2,
> VectorIterator.patch.2.tar.bz2, VectorIterator.patch.tar.bz2
>
>
> Implemented an Iterator interface for the Vector classes. Was necessary for
> porting from Float[] used in some parts of the code.