BOOM, thanks Tom.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Tom Barber <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, August 18, 2014 1:21 PM
To: "[email protected]" <[email protected]>
Subject: Re: Remote data transfer

>I promise to finish my Pentaho PDI plugins as well at some point, then
>you can all slurp and transform and ingest from pretty much anywhere.
>
>Tom
>
>On 18/08/14 19:39, Thomas Bennett wrote:
>> Thanks Chris.
>>
>> Just to add to the conversation - what protocols are currently
>> supported?
>>
>> I've seen scp, FTP and http. Also Amazon S3?
>>
>> On Monday, August 18, 2014, Mattmann, Chris A (3980)
>> <[email protected]> wrote:
>>
>>> Hi Etienne,
>>>
>>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>>> files usually *ahead* of file manager ingestion, since the crawler
>>> really doesn't have a protocol layer to mitigate remote content.
>>> The typical use case if you use Push Pull is:
>>>
>>> 1. Model remote/ancillary files on other sites
>>> 2. Download them with push pull into a "staging area"
>>> 3. Crawl and ingest with crawler, as if the content were
>>>    local to start out with.
>>>
>>> There is a Push Pull users guide here, it's a bit old but should
>>> explain it:
>>>
>>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>>
>>> Cheers,
>>> Chris
>>>
>>> -----Original Message-----
>>> From: Etienne Koen <[email protected]>
>>> Date: Monday, August 18, 2014 2:36 AM
>>> To: Thomas Bennett <[email protected]>
>>> Cc: Chris Mattmann <[email protected]>, "[email protected]"
>>> <[email protected]>, "[email protected]" <[email protected]>
>>> Subject: RE: Remote data transfer
>>>
>>>> Hi Tomas and all,
>>>>
>>>> I came across the push/pull tutorial on
>>>> https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>>>>
>>>> Would this guide be more appropriate to download files that have been
>>>> archived by the file manager and represent a typical user scenario?
>>>>
>>>> Regards
>>>> Etienne
>>>> ________________________________________
>>>> From: Thomas Bennett [[email protected]]
>>>> Sent: Friday, August 15, 2014 9:54 AM
>>>> To: Etienne Koen
>>>> Cc: Mattmann, Chris A (3980); [email protected]; [email protected]
>>>> Subject: Re: Remote data transfer
>>>>
>>>> Hi Etienne,
>>>>
>>>> There are various methods you can use to download the data.
>>>>
>>>> See this page:
>>>>
>>>> https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>>>>
>>>> Recently there is some great work that has been done on using a REST
>>>> API - this exists on svn trunk. I don't think it has been released yet.
>>>>
>>>> https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>>>>
>>>> To use these components you will need to deploy tomcat or jetty.
>>>>
>>>> Shout if you need some help.
>>>>
>>>> Cheers,
>>>> Tom
>>>>
>>>> On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
>>>> <[email protected]> wrote:
>>>> Hi Chris and Tom,
>>>>
>>>> As I have mentioned before in my previous email, I have managed to
>>>> ingest a file to a remote location using the filemgr-client. I am also
>>>> able to query the information remotely using for example the query_tool
>>>> in this way:
>>>>
>>>> $ ./query_tool --url http://192.168.0.10:9000 --lucene -query 'CAS.ProductName:blah.txt'
>>>>
>>>> 978ca28e-23b0-11e4-87fb-4f1c29029486
>>>>
>>>> What component would I use for searching and downloading the actual
>>>> product from the remote file manager? Is the filemgr-client or
>>>> query_tool capable of doing this?
>>>>
>>>> Are there any tutorials you would recommend?
>>>>
>>>> Thanks
>>>> Etienne
>>>>
>>>> ________________________________________
>>>> From: Mattmann, Chris A (3980) [[email protected]]
>>>> Sent: Wednesday, August 13, 2014 6:04 PM
>>>> To: Etienne Koen; Thomas Bennett
>>>> Cc: [email protected]; [email protected]; Mattmann, Chris A (3980)
>>>> Subject: Re: Remote data transfer
>>>>
>>>> Thanks guys.
>>>>
>>>> Etienne, I hope you don't mind but I've copied [email protected]
>>>> on this email. That way you can tap into the entire Apache OODT
>>>> community for help.
>>>>
>>>> The "URI has an authority component" message is usually an error
>>>> indicating that you have referenced some environment variable in your
>>>> config (e.g., filemgr.properties in the etc directory) but that
>>>> variable isn't defined. E.g., maybe you have a *.policy.dirs property
>>>> set to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and
>>>> SOME_UNDEFINED_VARIABLE is undefined.
>>>>
>>>> Can you check that to see if that's the root cause of this issue?
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> ------------------------
>>>> Chris Mattmann
>>>> [email protected]
>>>>
>>>> -----Original Message-----
>>>> From: Etienne Koen <[email protected]>
>>>> Date: Wednesday, August 13, 2014 1:42 AM
>>>> To: Thomas Bennett <[email protected]>
>>>> Cc: "[email protected]" <[email protected]>, Chris Mattmann
>>>> <[email protected]>
>>>> Subject: RE: Remote data transfer
>>>>
>>>>> Hi Tom,
>>>>>
>>>>> I get the following error when using the argument:
>>>>>
>>>>> ERROR: Failed to ingest product 'blah.txt' : URI has an authority component
>>>>>
>>>>> Here both the server and client were using port 9000
>>>>>
>>>>> I get this when both the server and client are running on the same
>>>>> port
>>>>>
>>>>> When communicating on different ports I get:
>>>>>
>>>>> <-- some I/O / HTTP exceptions -->
>>>>> ...
>>>>> ...
>>>>>
>>>>> ERROR: Failed to ingest product 'blah.txt' : Connection refused
>>>>>
>>>>> Server:9000 and Client:431
>>>>>
>>>>> Do you know what any of this means?
>>>>>
>>>>> Cheers
>>>>> Etienne
>>>>>
>>>>> ________________________________________
>>>>> From: Thomas Bennett [[email protected]]
>>>>> Sent: Wednesday, August 13, 2014 10:02 AM
>>>>> To: Etienne Koen
>>>>> Cc: [email protected]; [email protected]
>>>>> Subject: Re: Remote data transfer
>>>>>
>>>>> Hey Etienne,
>>>>>
>>>>> I've been out of the office the last week but I'm back now.
>>>>>
>>>>> ./filemgr-client --url http://localhost:9000 --operation
>>>>> --ingestProduct --productName blah.txt --productStructure Flat
>>>>> --productTypeName GenericFile --metadataFile file:///tmp/blah.txt.met
>>>>> --refs file:///tmp/blah.txt
>>>>>
>>>>> How would this line be modified to achieve what I want to do? I see
>>>>> there is also an argument --clientTransfer --dataTransfer but I am not
>>>>> sure what Java class to use for this?
>>>>>
>>>>> You will need to specify the filemgr remotely, i.e.: --url
>>>>> http://192.168.0.1 - are you doing this?
>>>>>
>>>>> I've done remote file transfer before; I'll see if I can remember how
>>>>> to do it.
>>>>>
>>>>> Can I log into the CHPC with the usual credentials?
>>>>>
>>>>> Cheers,
>>>>> Tom
>>>>> --
>>>>> Thomas Bennett
>>>>>
>>>>> SKA South Africa
>>>>> Science Processing Team
>>>>>
>>>>> Office: +27 21 5067341
>>>>> Mobile: +27 79 5237105
>>>>>
>>>>> ________________________________
>>>>> Disclaimer: This E-mail message, including any attachments, is
>>>>> intended only for the person or entity to which it is addressed, and
>>>>> may contain confidential information. Each page attached hereto must
>>>>> also be read in conjunction with this disclaimer.
>>>>> If you are not the intended recipient you are hereby notified that any
>>>>> disclosure, copying, distribution or reliance upon the contents of
>>>>> this e-mail is strictly prohibited. E.&O.E.
>
>
>--
>*Tom Barber* | Technical Director
>
>meteorite bi
>*T:* +44 20 8133 3730
>*W:* www.meteorite.bi | *Skype:* meteorite.consulting
>*A:* Surrey Technology Centre, Surrey Research Park, Guildford, GU2 7YG,
>UK
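
For anyone hitting the "URI has an authority component" error discussed earlier in this thread, the check Chris suggests (looking for an environment variable referenced in the config but never defined) could be sketched roughly as below. This is only an illustration, not part of OODT: the function name find_undefined_variables, the sample property keys, and the assumption that placeholders are always written as [NAME] (as in the file://[SOME_UNDEFINED_VARIABLE]/path/dir/ example) are mine.

```python
import os
import re

# Bracketed placeholders such as [SOME_UNDEFINED_VARIABLE]; the bracket
# syntax is taken from the example in the thread. Assumption: names are
# plain identifiers.
PLACEHOLDER = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]")


def find_undefined_variables(properties_text, env=None):
    """Return (line_number, variable_name) pairs for placeholders that
    have no matching entry in env (defaults to the process environment)."""
    env = os.environ if env is None else env
    missing = []
    for line_number, line in enumerate(properties_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and properties-style comments.
        if not stripped or stripped.startswith("#"):
            continue
        for name in PLACEHOLDER.findall(stripped):
            if name not in env:
                missing.append((line_number, name))
    return missing


if __name__ == "__main__":
    # Hypothetical excerpt; the property keys are made up for illustration.
    sample = "\n".join([
        "# hypothetical filemgr.properties excerpt",
        "example.policy.dirs=file://[SOME_UNDEFINED_VARIABLE]/path/dir/",
        "example.other.dir=file://[DEFINED_VARIABLE]/data",
    ])
    for line_number, name in find_undefined_variables(
        sample, env={"DEFINED_VARIABLE": "/opt"}
    ):
        print(f"line {line_number}: [{name}] is not defined")
```

To check a real config, read the contents of filemgr.properties into a string and call find_undefined_variables(text) with no env argument, so the lookup uses the environment of the shell that starts the file manager.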
