[ 
https://issues.apache.org/jira/browse/CONNECTORS-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13681055#comment-13681055
 ] 

Karl Wright edited comment on CONNECTORS-710 at 6/12/13 9:05 AM:
-----------------------------------------------------------------

Hi Koji-san and Osuka-san,

I apologize; I did not understand the use case you were trying to address. I 
agree it is a valid use case.

I have some observations.  First, it seems to me that the interpretation of the 
file path as a potential WGET URL will occur only for specific start points.  
So it might be a good idea to add the checkbox proposed on the Paths tab, and 
have one per start point.  For the document specification, this would mean just 
adding a new attribute to every "startpoint" node.  The UI is a bit trickier, 
but I can help if that turns out to be a challenge - please let me know.  I 
would consider changing the UI to use the table paradigm for start points 
instead of the current description/value paradigm - see the web connector 
Bandwidth or Access Credentials tabs for examples, here: 
http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#webrepository
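
As a rough illustration of the per-start-point idea, a Java sketch might look like the following. The names StartPoint, convertToUrl, findStartPoint and documentUri are hypothetical and are not part of the existing connector. The directory layout (scheme directory, then host, then path) is assumed from the example path in the issue description.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: StartPoint and convertToUrl are hypothetical names. */
final class StartPointSketch {

    /** One crawl root plus the proposed per-start-point checkbox value. */
    static final class StartPoint {
        final String path;
        final boolean convertToUrl;

        StartPoint(String path, boolean convertToUrl) {
            this.path = path;
            this.convertToUrl = convertToUrl;
        }

        /** The start point path with exactly one trailing slash. */
        String prefix() {
            return path.endsWith("/") ? path : path + "/";
        }
    }

    /** Returns the first start point whose directory contains filePath, or null. */
    static StartPoint findStartPoint(List<StartPoint> startPoints, String filePath) {
        for (StartPoint sp : startPoints) {
            if (filePath.startsWith(sp.prefix())) {
                return sp;
            }
        }
        return null;
    }

    /** Builds an http(s) URL when the flag is set and the layout matches, else a file: URI. */
    static String documentUri(StartPoint sp, String filePath) {
        if (sp != null && sp.convertToUrl && filePath.startsWith(sp.prefix())) {
            String relative = filePath.substring(sp.prefix().length());
            int slash = relative.indexOf('/');
            if (slash > 0) {
                String scheme = relative.substring(0, slash);
                if (scheme.equals("http") || scheme.equals("https")) {
                    return scheme + "://" + relative.substring(slash + 1);
                }
            }
        }
        // Fall back to the current behaviour.
        return "file:" + filePath;
    }

    public static void main(String[] args) {
        List<StartPoint> startPoints = Arrays.asList(
            new StartPoint("/Users/minoru/tmp/out", true),
            new StartPoint("/Users/minoru/docs", false));
        String file = "/Users/minoru/tmp/out/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp";
        System.out.println(documentUri(findStartPoint(startPoints, file), file));
    }
}
```

Start points without the flag, or files that do not sit under an http or https directory, keep the current file: form in this sketch.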

Second, since the manner in which the URL is generated will affect how File 
Connector data is indexed, this information (really, just a boolean) does need 
to be included in the version string for every document.  Ideally, it would be 
put into the version string in the getDocumentVersions() method, and pulled 
from the version string in processDocuments() so that it can be used to inform 
the decision of how to create the right URL.  This keeps what is indexed and 
what is versioned consistent.
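
A minimal Java sketch of carrying the boolean in the version string follows. The format shown (a leading "+" or "-" followed by modification time and length) and the helper names packVersion and unpackConvertToUrl are assumptions for illustration only; they are not the connector's actual version-string format or API. The first helper stands in for what getDocumentVersions() would do, the second for what processDocuments() would do.

```java
/** Illustrative only: not the connector's real version-string format. */
final class VersionStringSketch {

    /** Packs the flag in front of the other version components. */
    static String packVersion(long lastModified, long length, boolean convertToUrl) {
        return (convertToUrl ? "+" : "-") + lastModified + ":" + length;
    }

    /** Recovers the flag from a version string produced by packVersion. */
    static boolean unpackConvertToUrl(String version) {
        if (version == null || version.isEmpty()) {
            throw new IllegalArgumentException("empty version string");
        }
        char flag = version.charAt(0);
        if (flag == '+') {
            return true;
        }
        if (flag == '-') {
            return false;
        }
        throw new IllegalArgumentException("unrecognized flag: " + flag);
    }

    public static void main(String[] args) {
        String version = packVersion(1370000000000L, 2048L, true);
        System.out.println(version);
        System.out.println(unpackConvertToUrl(version));
    }
}
```

With the flag inside the version string, changing the checkbox for a start point changes the version of every document under it, so those documents should be picked up for reindexing with the new URL form.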

If you have any questions about this proposal, please don't hesitate to ask, 
and I apologize once more for not understanding the original idea more quickly. 
;-)

                
                  
> FileConnector should have option of outputting a full http url, not just a 
> file:/ url
> -------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-710
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-710
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: File system connector
>    Affects Versions: ManifoldCF 1.3
>            Reporter: Minoru Osuka
>             Fix For: ManifoldCF 1.3
>
>         Attachments: Screen Shot 2013-06-11 at 5.46.55 PM.png
>
>
> I would like to enhance FileConnector so that it can convert a file path 
> to a URI.
> FileOutputConnector writes files to paths laid out like Wget's output:
> $OUTPUT_PATH/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp
> I would like FileConnector to be able to emit a document identifier 
> like WebConnector does.
> The current FileConnector outputs the following id:
> {code:xml}
> <str 
> name="id">file:/Users/minoru/tmp/out/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp</str>
> {code}
> The enhanced FileConnector could output either the following id:
> {code:xml}
> <str 
> name="id">file:/Users/minoru/tmp/out/http/localhost:8345/mcf-crawler-ui/showjobstatus.jsp</str>
> {code}
> or
> {code:xml}
> <str 
> name="id">http://localhost:8345/mcf-crawler-ui/showjobstatus.jsp</str>
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
