Hi Etienne,

Thanks. The Push Pull system is a way to pull down remote or ancillary
files, usually *ahead* of file manager ingestion, since the crawler
doesn't really have a protocol layer for handling remote content.
The typical use case for Push Pull is:

1. Model remote/ancillary files on other sites
2. Download them with push pull into a "staging area"
3. Crawl and ingest with crawler, as if the content were
local to start out with.
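
As a rough sketch of the three steps above (plain Python, not actual
Push Pull or crawler code): the function names, the directory names and
the use of a local copy in place of a real network transfer are all
assumptions made for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def pull_to_staging(remote_files, staging_dir):
    """Step 2: copy each modeled remote file into the staging area.

    A local copy stands in for the real network transfer (assumption).
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in remote_files:
        dest = staging_dir / src.name
        shutil.copy2(src, dest)
        staged.append(dest)
    return staged


def crawl_staging(staging_dir):
    """Step 3: walk the staging area as if the files had always been local."""
    return sorted(p for p in staging_dir.rglob("*") if p.is_file())


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Step 1: "model" the remote site as a plain list of files.
        # The directory below is a hypothetical stand-in for another site.
        remote_site = root / "remote_site"
        remote_site.mkdir()
        (remote_site / "blah.txt").write_text("science data\n")
        (remote_site / "ancillary.txt").write_text("ancillary data\n")
        remote_files = sorted(remote_site.iterdir())

        staging = root / "staging"
        pull_to_staging(remote_files, staging)
        # A real crawler would hand each staged file to the file manager here.
        return [p.name for p in crawl_staging(staging)]


if __name__ == "__main__":
    print(run_demo())
```

The point of the split is that the crawl step never needs to know where
the files came from; it only ever sees the staging area.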

There is a Push Pull user's guide here; it's a bit old, but it should
explain the system:

http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/


Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Etienne Koen <[email protected]>
Date: Monday, August 18, 2014 2:36 AM
To: Thomas Bennett <[email protected]>
Cc: Chris Mattmann <[email protected]>, "[email protected]"
<[email protected]>, "[email protected]" <[email protected]>
Subject: RE: Remote data transfer

>Hi Thomas and all,
>
>I came across the push/pull tutorial on
>https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>.
>
>Would this guide be more appropriate for downloading files that have been
>archived by the file manager, and does it represent a typical user scenario?
>
>Regards
>Etienne
>________________________________________
>From: Thomas Bennett [[email protected]]
>Sent: Friday, August 15, 2014 9:54 AM
>To: Etienne Koen
>Cc: Mattmann, Chris A (3980); [email protected]; [email protected]
>Subject: Re: Remote data transfer
>
>Hi Etienne,
>
>There are various methods you can use to download the data.
>
>See this page:
>https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>
>Some great work has recently been done on a REST API. It exists on svn
>trunk, but I don't think it has been released yet.
>
>https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>
>To use these components you will need to deploy Tomcat or Jetty.
>
>Shout if you need some help.
>
>Cheers,
>Tom
>
>
>
>
>On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
><[email protected]> wrote:
>Hi Chris and Tom,
>
>As I mentioned in my previous email, I have managed to ingest a file to
>a remote location using the filemgr-client. I am also able to query the
>information remotely, for example with the query_tool:
>
>$ ./query_tool --url http://192.168.0.10:9000 --lucene -query
>'CAS.ProductName:blah.txt'
>
>978ca28e-23b0-11e4-87fb-4f1c29029486
>
>What component would I use for searching and downloading the actual
>product from the remote file manager? Is the filemgr-client or query_tool
>capable of doing this?
>
>Are there any tutorials you would recommend?
>
>Thanks
>Etienne
>
>________________________________________
>From: Mattmann, Chris A (3980)
>[[email protected]]
>Sent: Wednesday, August 13, 2014 6:04 PM
>To: Etienne Koen; Thomas Bennett
>Cc: [email protected]; [email protected]; Mattmann, Chris A (3980)
>Subject: Re: Remote data transfer
>
>Thanks guys.
>
>Etienne, I hope you don't mind, but I've copied [email protected] on this
>email. That way you can tap into the entire Apache OODT community for
>help.
>
>The "URI has an authority component" error usually indicates that you
>have referenced some environment variable in your config (e.g.,
>filemgr.properties in the etc directory) but that variable isn't
>defined. For example, maybe you have a *.policy.dirs property set to
>file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
>is undefined.
>
>Can you check that to see if that's the root cause of this issue?
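
One way to run that check, as a minimal sketch in plain Python (this is
not OODT code): a file: URI only has an authority component when
something sits between "file://" and the next "/". The helper names and
the sample values below are hypothetical, and how OODT actually expands
an undefined variable is not shown here.

```python
def file_uri_authority(uri):
    """Return the authority part of a file: URI ('' if there is none).

    Simplification: the authority is taken to be everything between
    "file://" and the next "/".
    """
    prefix = "file://"
    if not uri.startswith(prefix):
        return ""
    rest = uri[len(prefix):]
    return rest.split("/", 1)[0]


def find_suspect_uris(uris):
    """Return the URIs whose authority part is non-empty."""
    return [u for u in uris if file_uri_authority(u)]


if __name__ == "__main__":
    # Hypothetical property values, as they might look after a failed
    # variable substitution.
    values = [
        "file:///tmp/policy",
        "file://[SOME_UNDEFINED_VARIABLE]/path/dir/",
    ]
    for uri in find_suspect_uris(values):
        print("has an authority component:", uri)
```

Any value this flags is worth tracing back to the variable it was built
from.
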
>
>Cheers,
>Chris
>
>------------------------
>Chris Mattmann
>[email protected]
>
>
>
>
>-----Original Message-----
>From: Etienne Koen <[email protected]>
>Date: Wednesday, August 13, 2014 1:42 AM
>To: Thomas Bennett <[email protected]>
>Cc: "[email protected]" <[email protected]>, Chris Mattmann
><[email protected]>
>Subject: RE: Remote data transfer
>
>>Hi Tom,
>>
>>I get the following error when using the argument:
>>
>>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>>component
>>
>>I get this when both the server and client are using the same port
>>(9000).
>>
>>When communicating on different ports I get:
>>
>><-- some I/O / HTTP exceptions -->
>>...
>>...
>>
>>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>>
>>Server:9000 and Client:431
>>
>>Do you know what any of this means?
>>
>>Cheers
>>Etienne
>>
>>________________________________________
>>From: Thomas Bennett [[email protected]]
>>Sent: Wednesday, August 13, 2014 10:02 AM
>>To: Etienne Koen
>>Cc: [email protected]; [email protected]
>>Subject: Re: Remote data transfer
>>
>>Hey Etienne,
>>
>>I've been out of the office the last week but I'm back now.
>>
>>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>>--productName blah.txt --productStructure Flat --productTypeName
>>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>>file:///tmp/blah.txt
>>
>>How would this line be modified to achieve what I want to do? I see there
>>is also an argument --clientTransfer --dataTransfer but I am not sure
>>what java class to use for this?
>>
>>You will need to specify the remote filemgr, i.e. --url
>>http://192.168.0.1. Are you doing this?
>>
>>I've done remote file transfer before; I'll see if I can remember how
>>to do it.
>>
>>Can I log into the CHPC with the usual credentials?
>>
>>Cheers,
>>Tom
>>--
>>Thomas Bennett
>>
>>SKA South Africa
>>Science Processing Team
>>
>>Office: +27 21 5067341
>>Mobile: +27 79 5237105
>>
>>________________________________
>>Disclaimer: This E-mail message, including any attachments, is intended
>>only for the person or entity to which it is addressed, and may contain
>>confidential information. Each page attached hereto must also be read in
>>conjunction with this disclaimer.
>>If you are not the intended recipient you are hereby notified that any
>>disclosure, copying, distribution or reliance upon the contents of this
>>e-mail is strictly prohibited. E.&O.E.
>
>
>
>
>
>--
>Thomas Bennett
>
>SKA South Africa
>Science Processing Team
>
>Office: +27 21 5067341
>Mobile: +27 79 5237105
>
