Hey Chintu,
From: "Mistry, Chintu [COLUMBUS TECHNOLOGIES AND SERVICES INC] (GSFC-586.0)"
<[email protected]>
Date: Tuesday, December 11, 2012 2:41 PM
To: jpluser
<[email protected]>,
"[email protected]" <[email protected]>
Subject: Re: OODT 0.3 branch
Answers inline below.
---snip
Gotcha, so you are using different product types. So each crawler is crawling
various product types in its own staging area dir, and the layout looks like,
e.g.:
/STAGING_AREA_BASE
  /dir1 – 1st crawler
    - file1 of product type 1
    - file2 of product type 3
  /dir2 – 2nd crawler
    - file3 of product type 3
  /dir3 – 3rd crawler
    - file4 of product type 2
Is that what the staging area looks like? - YES
And then your FM is ingesting all 3 product types (I just picked 3 arbitrarily;
it could have been N) into:
ARCHIVE_BASE/{ProductTypeName}/{YYYYMMDD}
Correct? - YES
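To make that layout concrete, here is a rough sketch of how a path of that shape could be put together. This is plain Python for illustration only, not OODT code; the function name `archive_dir` and its arguments are made up for the example.

```python
from datetime import date
from pathlib import PurePosixPath


def archive_dir(archive_base, product_type_name, ingest_date):
    """Build ARCHIVE_BASE/{ProductTypeName}/{YYYYMMDD} (hypothetical helper)."""
    # strftime("%Y%m%d") yields the zero-padded YYYYMMDD directory name.
    day_dir = ingest_date.strftime("%Y%m%d")
    return PurePosixPath(archive_base) / product_type_name / day_dir


if __name__ == "__main__":
    # Product type name and date are placeholders, not real config values.
    print(archive_dir("/ARCHIVE_BASE", "ProductType1", date(2012, 12, 11)))
```

With a layout like this, files of the same product type ingested on the same day land in the same directory, no matter which crawler picked them up.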
If so, I would imagine that using FM1, FM2 and FM3 would actually speed up the
ingestion process compared to just using 1 FM with 1, 2 or 3 crawlers all
talking to it.
Let me ask a few more questions:
Do you see e.g., in the above example that file4 is ingested before file2? What
about file3 before file2? If not, there is something wiggy going on.
- I have not checked that. I guess I can check that. Can FM handle
multiple connections at the same time?
Yep, FM can handle multiple connections at one time, up to a limit (I think it
is a hard-coded default of ~100-200 in the underlying XMLRPC 2.1 library). We're
using an old library currently but have a goal to upgrade to the latest version,
where I think this number is configurable.
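If it helps to picture what "up to a limit" means, here is a generic sketch of a server-side cap on simultaneous connections using a semaphore. It is not how the XMLRPC library or FM actually implements the limit; `ConnectionLimiter`, `run_clients` and the cap of 3 are all assumptions chosen for the example.

```python
import threading
import time


class ConnectionLimiter:
    """Allow at most `limit` requests to be handled at once (illustrative)."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0  # requests being handled right now
        self.peak = 0    # highest concurrency observed so far

    def handle(self, work):
        # Callers beyond the limit block here until a slot frees up.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return work()
            finally:
                with self._lock:
                    self.active -= 1


def run_clients(limiter, n_clients):
    """Simulate n_clients crawlers hitting the same server concurrently."""
    def fake_ingest():
        time.sleep(0.01)  # stand-in for the time one ingest call takes

    threads = [threading.Thread(target=limiter.handle, args=(fake_ingest,))
               for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return limiter.peak


if __name__ == "__main__":
    limiter = ConnectionLimiter(3)  # hypothetical cap, far below ~100-200
    print("peak concurrent requests:", run_clients(limiter, 10))
```

The point is just that extra clients wait rather than fail once the cap is reached, so a handful of crawlers talking to one FM should be well within the limit.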
Cheers,
Chris