Hi Tom,

Great question!

By default, all of the supported protocols are here in the cas-protocol module of Apache OODT:

http://svn.apache.org/repos/asf/oodt/trunk/protocol/

* ftp
* http(s)
* imaps
* sftp

Note that there is an Amazon S3 "data transfer" module in the File Manager, but not explicitly in Push Pull. It would hopefully not be too difficult (and a welcome patch!) to incorporate this functionality into the cas-protocol layer.

There are also these specific Push Pull plugins:

https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins

Note that the Push Pull plugins on the wiki page above leverage LGPL libraries, and I wasn't able to find a replacement for them. We aren't officially "recommending" them as Apache OODT PMC members, but they are useful FTP plugins if you can't get the existing protocol-ftp plugin to work. However, you do so knowingly, by explicitly downloading these plugins and building them into your OODT Push Pull installation.

I would love it if someone were to find ALv2-compatible versions of the above plugins so we could manage them in our code base, but that hasn't been done yet.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Thomas Bennett <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, August 18, 2014 11:39 AM
To: "[email protected]" <[email protected]>
Cc: Etienne Koen <[email protected]>, Thomas Bennett <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Remote data transfer

>Thanks Chris.
>
>Just to add to the conversation - what protocols are currently supported?
>
>I've seen scp, FTP and http. Also Amazon S3?
>
>On Monday, August 18, 2014, Mattmann, Chris A (3980) <[email protected]> wrote:
>
>> Hi Etienne,
>>
>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>> files usually *ahead* of file manager ingestion, since the crawler
>> really doesn't have a protocol layer to mitigate remote content.
>> The typical use case if you use Push Pull is:
>>
>> 1. Model remote/ancillary files on other sites
>> 2. Download them with push pull into a "staging area"
>> 3. Crawl and ingest with crawler, as if the content were
>>    local to start out with.
>>
>> There is a Push Pull users guide here, it's a bit old but should
>> explain it:
>>
>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>
>> Cheers,
>> Chris
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>> -----Original Message-----
>> From: Etienne Koen <[email protected]>
>> Date: Monday, August 18, 2014 2:36 AM
>> To: Thomas Bennett <[email protected]>
>> Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
>> Subject: RE: Remote data transfer
>>
>> >Hi Tomas and all,
>> >
>> >I came across the push/pull tutorial on
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>> >
>> >Would this guide be more appropriate to download files that have been
>> >archived by the file manager and represent a typical user scenario?
>> >
>> >Regards
>> >Etienne
>> >________________________________________
>> >From: Thomas Bennett [[email protected]]
>> >Sent: Friday, August 15, 2014 9:54 AM
>> >To: Etienne Koen
>> >Cc: Mattmann, Chris A (3980); [email protected]; [email protected]
>> >Subject: Re: Remote data transfer
>> >
>> >Hi Etienne,
>> >
>> >There are various methods you can use to download the data.
>> >
>> >See this page:
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>> >
>> >Recently there is some great work that has been done on using a REST API
>> >- this exists on svn trunk.
>> >I don't think it has been released yet.
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>> >
>> >To use these components you will need to deploy tomcat or jetty.
>> >
>> >Shout if you need some help.
>> >
>> >Cheers,
>> >Tom
>> >
>> >On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen <[email protected]> wrote:
>> >Hi Chris and Tom,
>> >
>> >As I have mentioned before in my previous email, I have managed to ingest
>> >a file to a remote location using the filemgr-client. I am also able to
>> >query the information remotely using for example the query_tool in this
>> >way:
>> >
>> >$ ./query_tool --url http://192.168.0.10:9000 --lucene -query 'CAS.ProductName:blah.txt'
>> >
>> >978ca28e-23b0-11e4-87fb-4f1c29029486
>> >
>> >What component would I use for searching and downloading the actual
>> >product from the remote file manager? Is the filemgr-client or query_tool
>> >capable of doing this?
>> >
>> >Are there any tutorials you would recommend?
>> >
>> >Thanks
>> >Etienne
>> >
>> >________________________________________
>> >From: Mattmann, Chris A (3980) [[email protected]]
>> >Sent: Wednesday, August 13, 2014 6:04 PM
>> >To: Etienne Koen; Thomas Bennett
>> >Cc: [email protected]; [email protected]; Mattmann, Chris A (3980)
>> >Subject: Re: Remote data transfer
>> >
>> >Thanks guys.
>> >
>> >Etienne, I hope you don't mind but I've copied
>> >[email protected]
>> >on this email. That way you can tap into the entire Apache OODT
>> >community for help.
>> >
>> >The URI has authority component is usually an error indicating
>> >that you have referenced some environment variable in your config
>> >(e.g., filemgr.properties in the etc directory) but that variable
>> >isn't defined. E.g., maybe you have a *.policy.dirs property set
>> >to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
>> >is undefined.
>> >
>> >Can you check that to see if that's the root cause of this issue?
>> >
>> >Cheers,
>> >Chris
>> >
>> >------------------------
>> >Chris Mattmann
>> >[email protected]
>> >
>> >-----Original Message-----
>> >From: Etienne Koen <[email protected]>
>> >Date: Wednesday, August 13, 2014 1:42 AM
>> >To: Thomas Bennett <[email protected]>
>> >Cc: "[email protected]" <[email protected]>, Chris Mattmann <[email protected]>
>> >Subject: RE: Remote data transfer
>> >
>> >>Hi Tom,
>> >>
>> >>I get the following error when using the argument:
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : URI has an authority
>> >>component
>> >>
>> >>Here both the server and client were using port 9000
>> >>
>> >>I get this when both the server and client are running on the same port
>> >>
>> >>When communicating on different ports I get:
>> >>
>> >><-- some I/O / HTTP exceptions -->
>> >>...
>> >>...
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>> >>
>> >>Server:9000 and Client:431
>> >>
>> >>Do you know what any of this mean?
>> >>
>> >>Cheers
>> >>Etienne
>> >>
>> >>________________________________________
>> >>From: Thomas Bennett [[email protected]]
>> >>Sent: Wednesday, August 13, 2014 10:02 AM
>> >>To: Etienne Koen
>> >>Cc: [email protected]; [email protected]
>> >>Subject: Re: Remote data transfer
>> >>
>> >>Hey Etienne,
>> >>
>> >>I've been out of the office the last week but I'm back now.
>> >>
>> >>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>> >>--productName blah.txt --productStructure Flat --productTypeName
>> >>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>> >>file:///tmp/blah.txt
>> >>
>> >>How would this line be modified to achieve what I want to do? I see there
>> >>is also an argument --clientTransfer --dataTransfer but I am not sure
>> >>what java class to use for this?
>> >>
>> >>You will need to specify the filemgr remotely ie: --url
>> >>http://192.168.0.1 - are you doing this?
>> >>
>> >>I've done remote file transfer before I'll see if I can remember how to
>> >>do it.
>> >>
>> >>Can I log into the CHPC with the usual credentials?
>> >>
>> >>Cheers,
>> >>Tom
>> >>--
>> >>Thomas Bennett
>> >>
>> >>SKA South Africa
>> >>Science Processing Team
>> >>
>> >>Office: +27 21 5067341
>> >>Mobile: +27 79 5237105
>> >>
>> >>________________________________
>> >>Disclaimer: This E-mail message, including any attachments, is intended
>> >>only for the person or entity to which it is addressed, and may contain
>> >>confidential information. Each page attached hereto must also be read in
>> >>conjunction with this disclaimer.
>> >>If you are not the intended recipient you are hereby notified that any
>> >>disclosure, copying, distribution or reliance upon the contents of this
>> >>e-mail is strictly prohibited. E.&O.E.
>> >
>> >--
>> >Thomas Bennett
>> >
>> >SKA South Africa
>> >Science Processing Team
>> >
>> >Office: +27 21 5067341
>> >Mobile: +27 79 5237105
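
A note on the "URI has an authority component" error discussed in the thread above: in a file URI, whatever sits between `file://` and the next `/` is the authority (host) part. The message matches the one Java's `java.io.File(URI)` constructor raises when that part is non-empty, although the thread does not confirm where OODT raises it. The sketch below is in Python rather than Java and only illustrates the URI parsing. The `expand` helper and the `UNSET` placeholder are assumptions for illustration; the thread does not show what OODT actually substitutes for an undefined variable.

```python
from urllib.parse import urlsplit


def authority_of(uri: str) -> str:
    """Return the authority (host) part of a URI, or '' if there is none."""
    return urlsplit(uri).netloc


def expand(template: str, name: str, value: str) -> str:
    """Hypothetical helper: replace the literal placeholder [name] with value."""
    return template.replace(f"[{name}]", value)


# Property value in the form quoted in the thread.
template = "file://[SOME_UNDEFINED_VARIABLE]/path/dir/"

# Variable defined as an absolute path: three slashes result, so no authority.
good = expand(template, "SOME_UNDEFINED_VARIABLE", "/opt/oodt")

# Assumption: 'UNSET' stands in for whatever an undefined variable turns into.
# Any text without a leading slash lands in the authority slot.
bad = expand(template, "SOME_UNDEFINED_VARIABLE", "UNSET")

for uri in (good, bad, "file:///tmp/blah.txt"):
    print(f"{uri!r}: authority={authority_of(uri)!r}")
```

If the authority printed for a configured path is anything other than an empty string, the variable referenced in that property is a likely place to look first.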
