Hello everyone,
I think I have found a bug. It is small, but potentially dangerous:

/org/apache/manifoldcf/crawler/connectors/filesystem/FileConnector.java:375

In particular:

else {
  uri = convertToURI(documentIdentifier);
  data.addField("uri", file.toString());
}

So a "uri" field is added to the RepositoryDocument with a value that is
not consistent with the documentURI that will be sent through the Manifold
workflow.

For example, in the RepositoryDocument we find:

URI : /Users/abenedetti/Documents/experiment/2.pdf

Meanwhile, the documentURI that travels across the connectors inside
Manifold will be:

DocumentURI : file:/Users/abenedetti/Documents//experiment/2.pdf

This is weird and in my opinion not consistent.
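
To make the mismatch concrete, here is a minimal standalone Java sketch. It does not use any ManifoldCF code: the class name and both helper methods are hypothetical, and I am only assuming that convertToURI produces a "file:" style URI along the lines of File.toURI(). It shows that a file's toString() and its URI form are different strings, so the "uri" field and the documentURI cannot match.

```java
import java.io.File;

public class UriFieldMismatch {

    // Hypothetical stand-in for the connector's convertToURI(...);
    // assumption: the real method yields a "file:" URI string.
    static String toUriString(String documentIdentifier) {
        return new File(documentIdentifier).toURI().toString();
    }

    // What the current code stores in the "uri" field: the plain path.
    static String toPathString(String documentIdentifier) {
        return new File(documentIdentifier).toString();
    }

    public static void main(String[] args) {
        String id = "/Users/abenedetti/Documents/experiment/2.pdf";
        System.out.println("file.toString() : " + toPathString(id));
        System.out.println("URI form        : " + toUriString(id));
        System.out.println("equal?          : " + toPathString(id).equals(toUriString(id)));
    }
}
```

Any downstream component that compares or joins on the two values would see them as different documents, which is why I consider this dangerous.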

Shouldn't it be:

else {
  uri = convertToURI(documentIdentifier);
  data.addField("uri", uri);
}

Cheers
-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
