[ https://issues.apache.org/jira/browse/MAHOUT-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858807#action_12858807 ]
Sean Owen commented on MAHOUT-379:
----------------------------------
I'd like to commit this patch since it addresses a couple of issues, but it's a
big one. It deserves a look if you have a moment. We can revert parts of it
later, but it's best to have a glance now.
> SequentialAccessSparseVector.equals does not agree with
> AbstractVector.equivalent
> ---------------------------------------------------------------------------------
>
> Key: MAHOUT-379
> URL: https://issues.apache.org/jira/browse/MAHOUT-379
> Project: Mahout
> Issue Type: Bug
> Components: Math
> Affects Versions: 0.4
> Reporter: Danny Leshem
> Assignee: Sean Owen
> Priority: Minor
> Fix For: 0.3
>
> Attachments: MAHOUT-379.patch, MAHOUT-379.patch, MAHOUT-379.patch
>
>
> When a SequentialAccessSparseVector is serialized and deserialized using
> VectorWritable, the resulting vector and the original vector are equivalent,
> yet equals returns false.
> The following unit test reproduces the problem:
> {code}
> @Test
> public void testSequentialAccessSparseVectorEquals() throws Exception {
>   final Vector v = new SequentialAccessSparseVector(1);
>   final VectorWritable vectorWritable = new VectorWritable(v);
>   final VectorWritable vectorWritable2 = new VectorWritable();
>   writeAndRead(vectorWritable, vectorWritable2);
>   final Vector v2 = vectorWritable2.get();
>   assertTrue(AbstractVector.equivalent(v, v2));
>   assertEquals(v, v2); // This line fails!
> }
>
> private void writeAndRead(Writable toWrite, Writable toRead) throws IOException {
>   final ByteArrayOutputStream baos = new ByteArrayOutputStream();
>   final DataOutputStream dos = new DataOutputStream(baos);
>   toWrite.write(dos);
>   final ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
>   final DataInputStream dis = new DataInputStream(bais);
>   toRead.readFields(dis);
> }
> {code}
> The problem seems to be that the original vector's name is null, while the
> deserialized vector's name is an empty string, because the writers replace a
> null name with "" on output. The same issue probably also happens with
> RandomAccessSparseVector.
> SequentialAccessSparseVectorWritable (line 40):
> {code}
> dataOutput.writeUTF(getName() == null ? "" : getName());
> {code}
> RandomAccessSparseVectorWritable (line 42):
> {code}
> dataOutput.writeUTF(this.getName() == null ? "" : this.getName());
> {code}
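For anyone who wants to see the effect without the Mahout classes, here is a minimal standalone sketch of what those two writeUTF calls do to a null name. It uses only java.io; the class name NameRoundTrip and the helper roundTrip are made up for illustration and are not part of Mahout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical demo class, not Mahout code.
class NameRoundTrip {

    // Writes the name the same way the quoted lines do (null becomes ""),
    // then reads it back from the resulting bytes.
    static String roundTrip(String name) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (DataOutputStream dos = new DataOutputStream(baos)) {
                dos.writeUTF(name == null ? "" : name);
            }
            try (DataInputStream dis =
                     new DataInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
                return dis.readUTF();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = null;
        String restored = roundTrip(original);
        System.out.println(restored.isEmpty());                 // prints true
        System.out.println(Objects.equals(original, restored)); // prints false
    }
}
```

The null is lost on the way out, so any equals that compares names strictly will see null on one side and "" on the other.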
> The simplest fix is probably to change the default Vector name from null to
> the empty string.
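A rough sketch of that idea, assuming the name is normalized once in the constructor. SketchVector is a hypothetical stand-in written for this example, not Mahout's Vector hierarchy, and the patch attached to the issue may well take a different approach.

```java
import java.util.Arrays;

// Hypothetical stand-in for a named vector; not Mahout's actual class.
final class SketchVector {
    private final String name;
    private final double[] values;

    // An unnamed vector gets the default name.
    SketchVector(int cardinality) {
        this(null, new double[cardinality]);
    }

    SketchVector(String name, double[] values) {
        // Normalize null to "" so an unnamed vector and its deserialized
        // copy (which reads back "") carry the same name.
        this.name = name == null ? "" : name;
        this.values = values.clone();
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SketchVector)) {
            return false;
        }
        SketchVector other = (SketchVector) o;
        // name is never null here, so a plain equals is safe.
        return name.equals(other.name) && Arrays.equals(values, other.values);
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + Arrays.hashCode(values);
    }
}
```

With this, new SketchVector(1) and new SketchVector("", new double[1]) compare equal, which is the behavior the failing assertEquals in the test expects. The trade-off is that "no name" and "empty name" can no longer be told apart.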
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.