Hi Etienne,

Thanks. The Push Pull system is a way to pull down remote or ancillary files, usually *ahead* of file manager ingestion, since the crawler doesn't really have a protocol layer for handling remote content. The typical use case for Push Pull is:

1. Model remote/ancillary files on other sites
2. Download them with push pull into a "staging area"
3. Crawl and ingest with crawler, as if the content were local to start out with.

There is a Push Pull users guide here, it's a bit old but should explain it:

http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Etienne Koen <[email protected]>
Date: Monday, August 18, 2014 2:36 AM
To: Thomas Bennett <[email protected]>
Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Subject: RE: Remote data transfer

>Hi Tomas and all,
>
>I came across the push/pull tutorial on
>https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>.
>
>Would this guide be more appropriate to download files that have been
>archived by the file manager and represent a typical user scenario?
>
>Regards
>Etienne
>________________________________________
>From: Thomas Bennett [[email protected]]
>Sent: Friday, August 15, 2014 9:54 AM
>To: Etienne Koen
>Cc: Mattmann, Chris A (3980); [email protected]; [email protected]
>Subject: Re: Remote data transfer
>
>Hi Etienne,
>
>There are various methods you can use to download the data.
>
>See this page:
>https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>
>Recently there is some great work that has been done on using a REST API
>- this exists on svn trunk. I don't think it has been released yet.
>
>https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>
>To use these components you will need to deploy tomcat or jetty.
>
>Shout if you need some help.
>
>Cheers,
>Tom
>
>
>On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen <[email protected]> wrote:
>Hi Chris and Tom,
>
>As I have mentioned before in my previous email, I have managed to ingest
>a file to a remote location using the filemgr-client. I am also able to
>query the information remotely using for example the query_tool in this
>way:
>
>$ ./query_tool --url http://192.168.0.10:9000 --lucene -query 'CAS.ProductName:blah.txt'
>
>978ca28e-23b0-11e4-87fb-4f1c29029486
>
>What component would I use for searching and downloading the actual
>product from the remote file manager? Is the filemgr-client or query_tool
>capable of doing this?
>
>Are there any tutorials you would recommend?
>
>Thanks
>Etienne
>
>________________________________________
>From: Mattmann, Chris A (3980) [[email protected]]
>Sent: Wednesday, August 13, 2014 6:04 PM
>To: Etienne Koen; Thomas Bennett
>Cc: [email protected]; [email protected]; Mattmann, Chris A (3980)
>Subject: Re: Remote data transfer
>
>Thanks guys.
>
>Etienne, I hope you don't mind but I've copied [email protected]
>on this email. That way you can tap into the entire Apache OODT
>community for help.
>
>"URI has an authority component" is usually an error indicating
>that you have referenced some environment variable in your config
>(e.g., filemgr.properties in the etc directory) but that variable
>isn't defined.
>E.g., maybe you have a *.policy.dirs property set
>to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
>is undefined.
>
>Can you check that to see if that's the root cause of this issue?
>
>Cheers,
>Chris
>
>------------------------
>Chris Mattmann
>[email protected]
>
>
>-----Original Message-----
>From: Etienne Koen <[email protected]>
>Date: Wednesday, August 13, 2014 1:42 AM
>To: Thomas Bennett <[email protected]>
>Cc: "[email protected]" <[email protected]>, Chris Mattmann <[email protected]>
>Subject: RE: Remote data transfer
>
>>Hi Tom,
>>
>>I get the following error when using the argument:
>>
>>ERROR: Failed to ingest product 'blah.txt' : URI has an authority component
>>
>>Here both the server and client were using port 9000
>>
>>I get this when both the server and client are running on the same port
>>
>>When communicating on different ports I get:
>>
>><-- some I/O / HTTP exceptions -->
>>...
>>...
>>
>>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>>
>>Server:9000 and Client:431
>>
>>Do you know what any of this means?
>>
>>Cheers
>>Etienne
>>
>>________________________________________
>>From: Thomas Bennett [[email protected]]
>>Sent: Wednesday, August 13, 2014 10:02 AM
>>To: Etienne Koen
>>Cc: [email protected]; [email protected]
>>Subject: Re: Remote data transfer
>>
>>Hey Etienne,
>>
>>I've been out of the office the last week but I'm back now.
>>
>>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>>--productName blah.txt --productStructure Flat --productTypeName
>>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>>file:///tmp/blah.txt
>>
>>How would this line be modified to achieve what I want to do?
>>I see there is also an argument --clientTransfer --dataTransfer but I am
>>not sure what java class to use for this?
>>
>>You will need to specify the filemgr remotely, i.e. --url
>>http://192.168.0.1 - are you doing this?
>>
>>I've done remote file transfer before; I'll see if I can remember how to
>>do it.
>>
>>Can I log into the CHPC with the usual credentials?
>>
>>Cheers,
>>Tom
>>--
>>Thomas Bennett
>>
>>SKA South Africa
>>Science Processing Team
>>
>>Office: +27 21 5067341
>>Mobile: +27 79 5237105
>>
>>________________________________
>>Disclaimer: This E-mail message, including any attachments, is intended
>>only for the person or entity to which it is addressed, and may contain
>>confidential information. Each page attached hereto must also be read in
>>conjunction with this disclaimer.
>>If you are not the intended recipient you are hereby notified that any
>>disclosure, copying, distribution or reliance upon the contents of this
>>e-mail is strictly prohibited. E.&O.E.
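
Chris's suggestion in the thread above (an environment variable referenced in a config file such as filemgr.properties but never defined) can be checked mechanically. The following is only a minimal sketch, not part of OODT: it assumes that variables are referenced as [NAME], as in the file://[SOME_UNDEFINED_VARIABLE]/path/dir/ example, and the function name, the property keys, and the HOME_DIR variable are hypothetical and chosen for illustration.

```python
import os
import re

# Assumption: config values reference environment variables as [NAME],
# following the file://[SOME_UNDEFINED_VARIABLE]/path/dir/ example in the thread.
ENV_REF = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]")


def undefined_env_refs(properties_text, env=None):
    """Map each property key to the referenced variables that are not defined."""
    env = os.environ if env is None else env
    missing = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not a key=value pair.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        names = [name for name in ENV_REF.findall(value) if name not in env]
        if names:
            missing[key.strip()] = names
    return missing


if __name__ == "__main__":
    # Hypothetical excerpt; real property names will differ.
    sample = "\n".join([
        "# excerpt of etc/filemgr.properties (illustrative only)",
        "example.policy.dirs=file://[SOME_UNDEFINED_VARIABLE]/path/dir/",
        "example.other.dir=[HOME_DIR]/data",
    ])
    print(undefined_env_refs(sample, env={"HOME_DIR": "/home/oodt"}))
```

To check a real deployment, read the properties file into a string and call undefined_env_refs(text) without the env argument, so that the current process environment is used. Any key it reports is a candidate cause of the "URI has an authority component" error.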
