[
https://issues.apache.org/jira/browse/CONNECTORS-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13681055#comment-13681055
]
Karl Wright edited comment on CONNECTORS-710 at 6/12/13 9:05 AM:
-----------------------------------------------------------------
Hi Koji-san and Osuka-san,
I apologize; I did not understand the use case you were trying to address. I
agree it is a valid use case.
I have some observations. First, it seems to me that the interpretation of the
file path as a potential WGET URL will occur only for specific start points.
So it might be a good idea to add the checkbox proposed on the Paths tab, and
have one per start point. For the document specification, this would mean just
adding a new attribute to every "startpoint" node. The UI is a bit trickier,
but I can help if that turns out to be a challenge - please let me know. I
would consider changing the UI to use the table paradigm for start points
instead of the current description/value paradigm - see the web connector
Bandwidth or Access Credentials tabs for examples, here:
http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#webrepository
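To make the per-start-point idea concrete, here is a rough sketch of how a flag carried on each start point might drive the choice of document URI. It is an illustration only: the class and method names are hypothetical and are not part of the existing File Connector, paths are treated as plain '/'-separated strings, and no URL encoding is attempted.

```java
// Hypothetical helper, not existing ManifoldCF code.
final class WgetPathSketch {
    // startPoint:   directory configured as the start point (assumed '/'-separated)
    // filePath:     absolute path of a file found under that start point
    // convertToUrl: the proposed per-start-point checkbox value
    static String toDocumentUri(String startPoint, String filePath, boolean convertToUrl) {
        String prefix = startPoint.endsWith("/") ? startPoint : startPoint + "/";
        if (convertToUrl && filePath.startsWith(prefix)) {
            // Path relative to the start point, e.g. "http/localhost:8345/a/b.jsp"
            String relative = filePath.substring(prefix.length());
            int slash = relative.indexOf('/');
            if (slash > 0) {
                String scheme = relative.substring(0, slash);
                String rest = relative.substring(slash + 1);
                // Only treat the first path segment as a scheme if it looks like one.
                if ((scheme.equals("http") || scheme.equals("https")) && !rest.isEmpty()) {
                    return scheme + "://" + rest;
                }
            }
        }
        // Fall back to the existing behavior: a file:/ style identifier.
        return "file:" + filePath;
    }
}
```

With the flag off, or for a file whose relative path does not begin with http/ or https/, the sketch falls back to the file:/ form, so start points without the checkbox would behave as they do today.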
Second, since the manner in which the URL is generated will affect how File
Connector data is indexed, this information (really, just a boolean) does need
to be included in the version string for every document. Ideally, it would be
put into the version string in the getDocumentVersions() method, and pulled
from the version string in processDocuments() so that it can be used to inform
the decision of how to create the right URL. This keeps what is indexed and
what is versioned consistent.
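As a minimal sketch of what I mean, assuming the version string is otherwise built from the file's modification time and length: the helper names below are hypothetical, and the actual signatures of getDocumentVersions() and processDocuments() are not reproduced here.

```java
// Hypothetical helper, not existing ManifoldCF code.
final class VersionStringSketch {
    // Would be called while computing versions: the flag is the first
    // character, so toggling it changes the version string and the
    // document is re-indexed with the new URL form.
    static String pack(long lastModified, long length, boolean convertToUrl) {
        return (convertToUrl ? "+" : "-") + lastModified + ":" + length;
    }

    // Would be called while processing: recover the flag from the version
    // string rather than re-reading the document specification.
    static boolean unpackConvertToUrl(String versionString) {
        if (versionString == null || versionString.isEmpty()) {
            throw new IllegalArgumentException("empty version string");
        }
        return versionString.charAt(0) == '+';
    }
}
```

Because the flag is read back from the version string, the URL that gets indexed always matches the version that was recorded for it.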
If you have any questions about this proposal, please don't hesitate to ask,
and I apologize once more for not understanding the original idea more quickly.
;-)
> FileConnector should have option of outputting a full http url, not just a
> file:/ url
> -------------------------------------------------------------------------------------
>
> Key: CONNECTORS-710
> URL: https://issues.apache.org/jira/browse/CONNECTORS-710
> Project: ManifoldCF
> Issue Type: Improvement
> Components: File system connector
> Affects Versions: ManifoldCF 1.3
> Reporter: Minoru Osuka
> Fix For: ManifoldCF 1.3
>
> Attachments: Screen Shot 2013-06-11 at 5.46.55 PM.png
>
>
> I would like to enhance FileConnector so that it can convert a file path
> to a URI.
> FileOutputConnector writes files using a Wget-style path layout:
> $OUTPUT_PATH/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp
> I would like FileConnector to be able to emit a documentIdentifier
> like WebConnector does.
> The current FileConnector outputs the following id:
> {code:xml}
> <str
> name="id">file:/Users/minoru/tmp/out/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp</str>
> {code}
> The enhanced FileConnector could output either the same id,
> {code:xml}
> <str
> name="id">file:/Users/minoru/tmp/out/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp</str>
> {code}
> or
> {code:xml}
> <str
> name="id">http://localhost:8345/mcf-crawler-ui/showjobstatus.jsp</str>
> {code}