Hello everyone,
I think I have found a bug. It is small, but potentially dangerous:
/org/apache/manifoldcf/crawler/connectors/filesystem/FileConnector.java:375
Specifically, this branch:
else {
  uri = convertToURI(documentIdentifier);
  data.addField("uri", file.toString());
}
So the RepositoryDocument gets a "uri" field whose value is not consistent
with the documentURI that is sent through the Manifold workflow.
For example, in the RepositoryDocument we find:
URI : /Users/abenedetti/Documents/experiment/2.pdf
while the documentURI that travels between the Manifold connectors is:
DocumentURI : file:/Users/abenedetti/Documents//experiment/2.pdf
This is odd and, in my opinion, inconsistent.
Shouldn't it be:
else {
  uri = convertToURI(documentIdentifier);
  data.addField("uri", uri);
}
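
To illustrate the mismatch outside of ManifoldCF, here is a minimal standalone sketch. It does not call the connector's own convertToURI; as an assumption, the standard File.toURI() stands in for it, and the class and method names (UriFieldSketch, pathForm, uriForm) are made up for this example. The exact strings printed depend on the platform, but the plain path form never carries the "file:" scheme that the URI form has.

```java
import java.io.File;

// Hypothetical illustration only: not ManifoldCF code.
public class UriFieldSketch {

    // What the current code stores in the "uri" field: the plain path.
    static String pathForm(File file) {
        return file.toString();
    }

    // Assumed stand-in for convertToURI(documentIdentifier):
    // the JDK's own conversion of a File to a "file:" URI string.
    static String uriForm(File file) {
        return file.toURI().toString();
    }

    public static void main(String[] args) {
        File file = new File("/Users/abenedetti/Documents/experiment/2.pdf");

        // The two values differ, so a consumer comparing the "uri" field
        // with the document URI would not see them as the same document.
        System.out.println("field value  : " + pathForm(file));
        System.out.println("document URI : " + uriForm(file));
        System.out.println("equal?       : " + pathForm(file).equals(uriForm(file)));
    }
}
```

Storing the already computed uri variable in the field, as proposed above, would avoid having two different representations of the same document.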
Cheers
--
--------------------------
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti