I promise to finish my Pentaho PDI plugins as well at some point; then you can all slurp, transform and ingest from pretty much anywhere.

Tom

On 18/08/14 19:39, Thomas Bennett wrote:
Thanks Chris.

Just to add to the conversation - what protocols are currently supported?

I've seen SCP, FTP and HTTP. Is Amazon S3 supported as well?

On Monday, August 18, 2014, Mattmann, Chris A (3980) <[email protected]> wrote:

Hi Etienne,

Thanks. The Push Pull system is a way to pull down remote or ancillary
files, usually *ahead* of file manager ingestion, since the crawler
doesn't really have a protocol layer for handling remote content.
The typical use case if you use Push Pull is:

1. Model remote/ancillary files on other sites
2. Download them with push pull into a "staging area"
3. Crawl and ingest with crawler, as if the content were
local to start out with.
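
The three steps above can be sketched in a few lines of Python. This is only an illustration of the staging pattern, not Push Pull's or the crawler's actual API: `stage_files`, `crawl` and the `fetch` callback are hypothetical names, and a local directory stands in for the remote site.

```python
import shutil
import tempfile
from pathlib import Path


def stage_files(remote_files, staging_dir, fetch):
    """Step 2: copy each modelled remote file into the staging area.

    `fetch(source, target)` is whatever protocol-specific download is needed.
    """
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    staged = []
    for source in remote_files:
        target = staging / Path(source).name
        fetch(source, target)
        staged.append(target)
    return staged


def crawl(staging_dir):
    """Step 3: list staged files exactly as if they had always been local."""
    return sorted(p for p in Path(staging_dir).iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as work:
    remote = Path(work) / "remote"  # stand-in for a remote site (assumption)
    remote.mkdir()
    (remote / "ancillary.dat").write_text("example")
    staging = Path(work) / "staging"
    # Step 1 is the list of files to pull; shutil.copyfile plays the downloader.
    stage_files([remote / "ancillary.dat"], staging, shutil.copyfile)
    print([p.name for p in crawl(staging)])
```

The point of the split is that only `fetch` needs to know about the transfer protocol; everything after staging works on ordinary local paths.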

There is a Push Pull user's guide here; it's a bit old but should
explain it:

http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/


Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Etienne Koen <[email protected]>
Date: Monday, August 18, 2014 2:36 AM
To: Thomas Bennett <[email protected]>
Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>,
"[email protected]" <[email protected]>
Subject: RE: Remote data transfer

Hi Thomas and all,

I came across the push/pull tutorial at:

https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide

Would this guide be more appropriate for downloading files that have been
archived by the file manager, and does it represent a typical user scenario?

Regards
Etienne
________________________________________
From: Thomas Bennett [[email protected]]
Sent: Friday, August 15, 2014 9:54 AM
To: Etienne Koen
Cc: Mattmann, Chris A (3980); [email protected]; [email protected]
Subject: Re: Remote data transfer

Hi Etienne,

There are various methods you can use to download the data.

See this page:

https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager

Some great work has been done recently on a REST API. It exists on svn
trunk, but I don't think it has been released yet.

https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API

To use these components you will need to deploy Tomcat or Jetty.

Shout if you need some help.

Cheers,
Tom




On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen <[email protected]> wrote:
Hi Chris and Tom,

As I mentioned in my previous email, I have managed to ingest a file to
a remote location using the filemgr-client. I am also able to query the
information remotely, for example with the query_tool:

$ ./query_tool --url http://192.168.0.10:9000 --lucene -query
'CAS.ProductName:blah.txt'

978ca28e-23b0-11e4-87fb-4f1c29029486

What component would I use for searching and downloading the actual
product from the remote file manager? Is the filemgr-client or query_tool
capable of doing this?

Are there any tutorials you would recommend?

Thanks
Etienne

________________________________________
From: Mattmann, Chris A (3980) [[email protected]]
Sent: Wednesday, August 13, 2014 6:04 PM
To: Etienne Koen; Thomas Bennett
Cc: [email protected]; [email protected]; Mattmann, Chris A (3980)
Subject: Re: Remote data transfer

Thanks guys.

Etienne, I hope you don't mind but I've copied [email protected]
on this email. That way you can tap into the entire Apache OODT
community for help.

"URI has an authority component" is usually an error indicating
that you have referenced some environment variable in your config
(e.g., filemgr.properties in the etc directory) but that variable
isn't defined. For example, maybe you have a *.policy.dirs property set
to file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
is not set in your environment.

Can you check that to see if that's the root cause of this issue?
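
One way to run that check might be a small script that scans a properties file for `[NAME]` placeholders with no matching environment variable. This is a rough sketch, not an OODT tool: `find_undefined_variables` is a hypothetical helper, the sample text is made up rather than taken from a real filemgr.properties, and the bracket pattern is assumed from the example above.

```python
import os
import re

# Matches [NAME] placeholders, the bracket style shown in the example above.
PLACEHOLDER = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]")


def find_undefined_variables(properties_text, environ=None):
    """Return (line_number, variable) pairs for placeholders with no value."""
    env = os.environ if environ is None else environ
    missing = []
    for number, line in enumerate(properties_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        for name in PLACEHOLDER.findall(stripped):
            if name not in env:
                missing.append((number, name))
    return missing


# Hypothetical excerpt for illustration only.
sample = """\
# example properties
example.policy.dirs=file://[SOME_UNDEFINED_VARIABLE]/path/dir/
example.other.dir=[EXAMPLE_HOME]/data
"""
print(find_undefined_variables(sample, environ={"EXAMPLE_HOME": "/opt/example"}))
```

To check a real file, pass the file's contents (for instance `open("etc/filemgr.properties").read()`) and leave `environ` unset so the current shell environment is used.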

Cheers,
Chris

------------------------
Chris Mattmann
[email protected]



-----Original Message-----
From: Etienne Koen <[email protected]>
Date: Wednesday, August 13, 2014 1:42 AM
To: Thomas Bennett <[email protected]>
Cc: "[email protected]" <[email protected]>, Chris Mattmann <[email protected]>
Subject: RE: Remote data transfer

Hi Tom,

I get the following error when using the argument:

ERROR: Failed to ingest product 'blah.txt' : URI has an authority
component

I get this when both the server and the client are using the same port (9000).
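
For context, the authority component of a URI is the part between `//` and the next `/`. The short sketch below uses Python purely for illustration (the file manager itself is Java, and `authority_of` is a hypothetical helper) to show how a `file:` URI written with two slashes instead of three ends up with an authority:

```python
from urllib.parse import urlsplit


def authority_of(uri):
    """Return the authority (netloc) part of a URI, or '' if there is none."""
    return urlsplit(uri).netloc


# Three slashes: the authority is empty and the path is /tmp/blah.txt.
print(repr(authority_of("file:///tmp/blah.txt")))
# Two slashes: "tmp" is parsed as the authority and the path becomes /blah.txt.
print(repr(authority_of("file://tmp/blah.txt")))
```

Anything that leaves text between `file://` and the first path slash, whether a typo or a substituted value, can produce a URI of the second kind.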

When communicating on different ports I get:

<-- some I/O / HTTP exceptions -->
...
...

ERROR: Failed to ingest product 'blah.txt' : Connection refused

Server:9000 and Client:431
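
A "Connection refused" error generally means that nothing accepted the TCP connection at the host and port the client tried. A minimal Python sketch for checking whether a port is reachable before running the client might look like this; `port_is_open` is a hypothetical helper, not part of OODT:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_is_open("localhost", 9000)` should return True while a server is listening on port 9000 on the same machine.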

Do you know what any of this means?

Cheers
Etienne

________________________________________
From: Thomas Bennett [[email protected]]
Sent: Wednesday, August 13, 2014 10:02 AM
To: Etienne Koen
Cc: [email protected]; [email protected]
Subject: Re: Remote data transfer

Hey Etienne,

I've been out of the office the last week but I'm back now.

> ./filemgr-client --url http://localhost:9000 --operation --ingestProduct
> --productName blah.txt --productStructure Flat --productTypeName
> GenericFile --metadataFile file:///tmp/blah.txt.met --refs
> file:///tmp/blah.txt
>
> How would this line be modified to achieve what I want to do? I see there
> is also an argument --clientTransfer --dataTransfer but I am not sure
> what java class to use for this?

You will need to point the client at the remote filemgr, i.e. --url
http://192.168.0.1 - are you doing this?

I've done remote file transfer before, so I'll see if I can remember how
to do it.

Can I log into the CHPC with the usual credentials?

Cheers,
Tom
--
Thomas Bennett

SKA South Africa
Science Processing Team

Office: +27 21 5067341
Mobile: +27 79 5237105

________________________________
Disclaimer: This E-mail message, including any attachments, is intended
only for the person or entity to which it is addressed, and may contain
confidential information. Each page attached hereto must also be read in
conjunction with this disclaimer.
If you are not the intended recipient you are hereby notified that any
disclosure, copying, distribution or reliance upon the contents of this
e-mail is strictly prohibited. E.&O.E.




--
Thomas Bennett

SKA South Africa
Science Processing Team

Office: +27 21 5067341
Mobile: +27 79 5237105




--
*Tom Barber* | Technical Director

meteorite bi
*T:* +44 20 8133 3730
*W:* www.meteorite.bi | *Skype:* meteorite.consulting
*A:* Surrey Technology Centre, Surrey Research Park, Guildford, GU2 7YG, UK
