Hi All,
I am a newbie here, so please be patient if I ask dumb questions. I need some
help troubleshooting an error on ingest. I am trying to work my way through the
process of setting up a small system to do some basic crawling and ingesting.
I have science files, metadata files, and engineering files that are dropped
off into a subdirectory of my staging directory. The name of the subdirectory
is based on the file type combined with the UTC year and day of year on which
the files are received.
For example: All of the engineering data files (*.ecsv, *.sfdu) are dropped
off in:
staging/ops/eng/[year]/[day of year]
And all of the raw science data files and their associated metadata files
(*.out, *.dtl, *.lbl) are dropped off in:
staging/ops/sci/[year]/[day of year]
Each of these new mime types has been added to filemgr/etc/mime-types.xml.
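
For reference, the year/day-of-year layout described above can be sketched in
shell. This is only an illustration: staging_subdir is a hypothetical helper
name (not part of OODT), and the -d option assumes GNU date.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OODT): build the drop-off directory for a
# file type ("eng" or "sci") from a UTC timestamp.
# Assumption: GNU date, which supports the -d option.
staging_subdir() {
    local file_type="$1"
    local when="${2:-now}"
    # %Y = four-digit year, %j = zero-padded day of year (001..366)
    printf 'staging/ops/%s/%s\n' "$file_type" "$(date -u -d "$when" +%Y/%j)"
}

# Example: a file received on 2 January 2014 (UTC)
staging_subdir eng "2014-01-02"   # prints staging/ops/eng/2014/002
```

Called without a second argument, the helper uses the current UTC time.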
The engineering files require only simple handling, so I am starting with those
first. I am using the AutoDetectProductCrawler. I have defined a ProductType of
"EngineeringFile" in filemgr/policy/oodt/product-types.xml. I added the
"EngineeringFile" (which is just a copy of the GenericFile product type for
now) to filemgr/policy/oodt/product-type-element-map.xml, and gave it the same
elements the "GenericFile" product type has. I added the engineering files'
mime types to extensions/policy/mime-extractor-map.xml, and I wrote a simple
ExternMetExtractor script in Perl that creates a .met file for each engineering
file in the directory, sets the same keys/values that are set in the example
met file "blah.txt.met", and returns 0 upon success.
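
To make the extractor step concrete, here is a minimal shell sketch of what
such a script might do (my actual extractor is in Perl). The three keys shown
(FileLocation, Filename, ProductType) and the cas:metadata layout are my
understanding of what the example met file contains, so treat them as
assumptions; write_met_file is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an external met extractor (not the real Perl script).
# Writes <product>.met next to the product file and returns 0 on success.
# Assumption: the keys below mirror the example met file "blah.txt.met".
write_met_file() {
    local product="$1"
    local product_type="$2"
    local dir name
    dir="$(cd "$(dirname "$product")" && pwd)"   # absolute directory of the product
    name="$(basename "$product")"
    # Note: values are not XML-escaped; names containing & or < would need escaping.
    cat > "${product}.met" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<cas:metadata xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <keyval>
    <key>FileLocation</key>
    <val>${dir}</val>
  </keyval>
  <keyval>
    <key>Filename</key>
    <val>${name}</val>
  </keyval>
  <keyval>
    <key>ProductType</key>
    <val>${product_type}</val>
  </keyval>
</cas:metadata>
EOF
}

# When run as a script, the first argument is taken as the product file path.
if [ "$#" -ge 1 ]; then
    write_met_file "$1" "EngineeringFile"
fi
```

The function's exit status is that of the final write, so a failure to create
the .met file would surface as a non-zero return code.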
I made my own copy of the oodt script and added statements to start
crawler_launcher for me. When I run the script all of the processes come up and
run, the directory is crawled, the extern extractor is called and the met files
are created, and the ingestion begins. Yay! This in itself was a big
accomplishment. But the ingestion is failing, and I can't figure out why. Can
anyone give me ideas on how to troubleshoot this? The error log is below. It
shows everything is successful up until the ingest. My crawler is invoked
like this:
exec "$CRAWLER_HOME"/bin/"$CRAWLER_EXEC" \
--operation --launchAutoCrawler \
--productPath "$OODT_HOME"/data/staging/ops/eng \
--filemgrUrl http://localhost:9000 \
--clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory \
--mimeExtractorRepo "$OODT_HOME"/extensions/policy/mime-extractor-map.xml
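
Because the log below reports that a File Manager PID file was found at
startup, one thing that may be worth ruling out before launching the crawler
is a stale PID file. A minimal sketch of such a check follows;
pid_file_status is a hypothetical helper, not something OODT provides.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of OODT): report whether the process
# recorded in a PID file is still alive.
# Prints "absent", "running", or "stale".
pid_file_status() {
    local pid_file="$1"
    local pid
    if [ ! -f "$pid_file" ]; then
        echo "absent"
        return 0
    fi
    pid="$(tr -d '[:space:]' < "$pid_file")"
    # kill -0 sends no signal; it only tests whether the PID exists and can be signalled.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

For example, pid_file_status "$OODT_HOME"/filemgr/run/cas.filemgr.pid would
show whether the process named in that file is still there.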
Thanks very much.
Valerie
Using OODT_BASE: /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy
Using OODT_HOME: /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy
Using OODT_TMPDIR: /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/temp
Using JRE_HOME: /project/jedi/users/jedi-pipeline/jdk1.7.0_55
Using CLASSPATH:
started filemgr
PID file (/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/filemgr/run/cas.filemgr.pid) found. Is File Manager still running? Start aborted.
Setting property 'StdProductCrawler.clientTransferer'
Setting property 'MetExtractorProductCrawler.clientTransferer'
Setting property 'AutoDetectProductCrawler.clientTransferer'
Setting property 'StdProductCrawler.filemgrUrl'
Setting property 'MetExtractorProductCrawler.filemgrUrl'
Setting property 'AutoDetectProductCrawler.filemgrUrl'
Setting property 'StdProductCrawler.productPath'
Setting property 'MetExtractorProductCrawler.productPath'
Setting property 'AutoDetectProductCrawler.productPath'
Setting property 'AutoDetectProductCrawler.mimeExtractorRepo'
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'StdProductCrawler.productPath' set to value [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'AutoDetectProductCrawler.mimeExtractorRepo' set to value [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/extensions/policy/mime-extractor-map.xml]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'MetExtractorProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'AutoDetectProductCrawler.filemgrUrl' set to value [http://localhost:9000]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'AutoDetectProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'AutoDetectProductCrawler.productPath' set to value [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'StdProductCrawler.filemgrUrl' set to value [http://localhost:9000]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'MetExtractorProductCrawler.filemgrUrl' set to value [http://localhost:9000]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'MetExtractorProductCrawler.productPath' set to value [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng]
Oct 02, 2014 12:14:36 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
FINE: Property 'StdProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler crawl
INFO: Crawling /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler crawl
INFO: Crawling /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler crawl
INFO: Crawling /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
INFO: Handling file /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.metadata.preconditions.PreCondEvalUtils eval
INFO: Passed precondition comparator id CheckThatDataFileSizeIsGreaterThanZero
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
INFO: Generating met file for product file: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu]
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
INFO: Executing command line: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/extensions/extractors/jediEngineeringFileExtractor.pl /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu ] with workingDir: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002] to extract metadata
OUTPUT: JEDI: EngineeringFile MET Extractor: processing file: [JEDI_2014002183121000_2014002190215999.sfdu]
OUTPUT: JEDI_2014002183121000_2014002190215999.sfdu is a sfdu engineering data file
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
INFO: Met extraction successful for product file: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu]
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
INFO: ProductCrawler: Ready to ingest product: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu]: ProductType: [EngineeringFile]
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.filemgr.ingest.StdIngester setFileManager
INFO: StdIngester: connected to file manager: [http://localhost:9000]
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferer setFileManagerUrl
INFO: Local Data Transfer to: [http://localhost:9000] enabled
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.filemgr.ingest.StdIngester ingest
INFO: StdIngester: ingesting product: ProductName: [JEDI_2014002183121000_2014002190215999.sfdu]: ProductType: [EngineeringFile]: FileLocation: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/]
org.apache.xmlrpc.XmlRpcException: java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null
    at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
    at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
    at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
    at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
    at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:185)
    at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
    at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.ingestProduct(XmlRpcFileManagerClient.java:1198)
    at org.apache.oodt.cas.filemgr.ingest.StdIngester.ingest(StdIngester.java:199)
    at org.apache.oodt.cas.crawl.ProductCrawler.ingest(ProductCrawler.java:304)
    at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:188)
    at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:108)
    at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
    at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:58)
    at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
    at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
    at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient ingestProduct
SEVERE: Failed to ingest product [org.apache.oodt.cas.filemgr.structs.Product@156b7537] : java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null -- rolling back ingest
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient ingestProduct
SEVERE: Failed to rollback ingest of product [org.apache.oodt.cas.filemgr.structs.Product@156b7537] : java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null
java.lang.Exception: Failed to ingest product [org.apache.oodt.cas.filemgr.structs.Product@156b7537] : java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null
    at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.ingestProduct(XmlRpcFileManagerClient.java:1303)
    at org.apache.oodt.cas.filemgr.ingest.StdIngester.ingest(StdIngester.java:199)
    at org.apache.oodt.cas.crawl.ProductCrawler.ingest(ProductCrawler.java:304)
    at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:188)
    at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:108)
    at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
    at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:58)
    at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
    at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
    at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.filemgr.ingest.StdIngester ingest
WARNING: exception ingesting product: [JEDI_2014002183121000_2014002190215999.sfdu]: Message: Failed to ingest product [org.apache.oodt.cas.filemgr.structs.Product@156b7537] : java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
WARNING: ProductCrawler: Exception ingesting product: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu]: Message: exception ingesting product: [JEDI_2014002183121000_2014002190215999.sfdu]: Message: Failed to ingest product [org.apache.oodt.cas.filemgr.structs.Product@156b7537] : java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null: attempting to continue crawling
org.apache.oodt.cas.filemgr.structs.exceptions.IngestException: exception ingesting product: [JEDI_2014002183121000_2014002190215999.sfdu]: Message: Failed to ingest product [org.apache.oodt.cas.filemgr.structs.Product@156b7537] : java.lang.Exception: org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error ingesting product [org.apache.oodt.cas.filemgr.structs.Product@fb60123] : null
    at org.apache.oodt.cas.filemgr.ingest.StdIngester.ingest(StdIngester.java:204)
    at org.apache.oodt.cas.crawl.ProductCrawler.ingest(ProductCrawler.java:304)
    at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:188)
    at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:108)
    at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
    at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:58)
    at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
    at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
    at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
Oct 02, 2014 12:14:36 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
WARNING: Failed to ingest product: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging/ops/eng/2014/002/JEDI_2014002183121000_2014002190215999.sfdu]: performing postIngestFail actions
Valerie A. Mallder
New Horizons Deputy Mission System Engineer
The Johns Hopkins University/Applied Physics Laboratory
11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
240-228-7846 (Office) 410-504-2233 (Blackberry)