Thanks, Val.

—
Chris Mattmann
[email protected]







On 4/6/16, 7:15 PM, "Mallder, Valerie" <[email protected]> wrote:

>I haven't had a chance to study this yet. But after a first pass through this
>email trail, I suspect that Kostas may be running into the same problem I
>ran into when Tika was either introduced or upgraded to a much newer version
>than had been in the system previously. After that happened, I ended up having
>to modify my mimetypes.xml file to get around the problem.
>I will look at this in detail tomorrow and compare it to my history of
>debugging when I was going from versions 0.6 to 0.7 to 0.8 to 0.9 and 0.10, and
>see if the problem is what I have seen before. However, I am staying at 0.10,
>so I won't be able to speak for going up to version 0.12.
>
>Val
>
>________________________________
>From: Chris Mattmann <[email protected]>
>Sent: Wednesday, April 6, 2016 9:58:15 PM
>To: [email protected]
>Subject: Re: Transition from OODT 0.6 to 0.12 cannot find extractor 
>specifications
>
>Thanks Kostas, they are wire compatible and this is a good
>use case.
>
>The crawler should not have undergone much update (perhaps none at
>all) since 0.6, so I am not exactly sure why you were seeing
>issues with it. There are definitely upgrades since 0.6 to CAS-PGE,
>and maybe that’s what you were running into.
>
>
>—
>Chris Mattmann
>[email protected]
>
>
>
>
>
>
>
>On 4/6/16, 6:47 PM, "Konstantinos Mavrommatis" <[email protected]> 
>wrote:
>
>>I am giving up on this....
>>I had used [1] in the first place to set up OODT (v0.6 back then), and my setup in
>>the new system is identical to the old one.
>>I could not make much out of [0]. Among other things, I tried to copy the
>>files in the old crawler/policy to the new crawler/policy, which included
>>some legacy-cmd-line-options.xml, legacy-cmd-line actions.xml. I also tried
>>to reinstall the full OODT on the client side, but it still did not work.
>>
>>I ended up reverting to the older version (0.6), which I run on my client. The
>>server (which runs FM) is still 0.12, but the combination seems to be working
>>fine.
>>
>>K
>>
>>-----Original Message-----
>>From: Lewis John Mcgibbney [mailto:[email protected]]
>>Sent: Tuesday, April 05, 2016 3:33 AM
>>To: [email protected]
>>Subject: Re: Transition from OODT 0.6 to 0.12 cannot find extractor 
>>specifications
>>
>>Hi K,
>>OK so I did a bit of searching here and located a bunch of files which are 
>>defined as legacy... you can check the search results out below 
>>https://github.com/apache/oodt/search?utf8=%E2%9C%93&q=AutoDetectProductCrawler&type=Code
>>I would urge you to have a look at the AutoDetectProductCrawler Javadoc 
>>description included in master branch [0] as well to see if you've got 
>>everything required.
>>Finally, I came across some documentation on the wiki which may guide you in
>>the right direction [1]. It may also be outdated, though, so please let us know
>>if that is the case.
>>hth
>>
>>[0]
>>https://github.com/apache/oodt/blob/91d0bafe71124906bd94baad746189caf35fb39c/crawler/src/main/java/org/apache/oodt/cas/crawl/AutoDetectProductCrawler.java#L40-L64
>>[1]
>>https://cwiki.apache.org/confluence/display/OODT/Mime+type+detection+with+the+AutoDetectProductCrawler
>>
>>On Mon, Apr 4, 2016 at 10:54 PM, Konstantinos Mavrommatis < 
>>[email protected]> wrote:
>>
>>> Hi,
>>> It seems to be happening for a number of the file types that I have in
>>> the mimetypes.xml.
>>> A few things are puzzling to me: this file, which is a .gz file, is not
>>> processed by the regular Tika mimetypes, which contain the gzip files.
>>> A file that has no extension, which defaults to txt, is passed to the
>>> MetExtractor.pl and processed.
>>>
>>> Any ideas on how I can find which preconditions fail? I tried to
>>> change the log level to DEBUG for all components but I did not get
>>> much more information. This must be something that changed in the OODT
>>> releases after 0.6, but I could not find anything relevant in the release notes.
>>> I also noticed in the documentation of the AutoDetectProductCrawler
>>> that it uses the file met-extr-preconditions.xml, which I could not
>>> find anywhere in the deployed OODT or the src directories. Could that
>>> be a reason for the problem I observe?
>>>
>>> Thanks
>>> K
>>>
>>> -----Original Message-----
>>> From: Lewis John Mcgibbney [mailto:[email protected]]
>>> Sent: Monday, April 04, 2016 3:24 PM
>>> To: [email protected]
>>> Subject: Re: Transition from OODT 0.6 to 0.12 cannot find extractor
>>> specifications
>>>
>>> Hi Konstantinos,
>>> It appears to be happening with a tar.gz file as well right?
>>>
>>> WARNING: No extractor specs specified for
>>> /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.log.gz
>>>
>>> I wonder if it is the file names... However I would be extremely
>>> surprised as I've seen some much more verbose file naming.
>>> Lewis
>>>
>>> On Saturday, April 2, 2016, Konstantinos Mavrommatis <
>>> [email protected]> wrote:
>>>
>>> > Hi,
>>> > I am trying to replicate a fully functional service that I had set up
>>> > a long time ago using OODT 0.6, but I am having the following problem
>>> > that does not allow me to ingest files. When I try to ingest files
>>> > with the extension fastq.gz I get the line:
>>> > WARNING: No extractor specs specified for
>>> > /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > And of course the file is not ingested. This process works without
>>> > problem with OODT 0.6 on a different server.
>>> >
>>> > The crawler command I am running is:
>>> > ./crawler_launcher \
>>> > --operation \
>>> > --launchAutoCrawler \
>>> > --productPath $FILEPATH \
>>> > --filemgrUrl $OODT_FILEMGR_URL \
>>> > --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory \
>>> > --mimeExtractorRepo ../policy/mime-extractor-map.xml \
>>> > --noRecur \
>>> > --crawlForDirs 2>&1
>>> >
>>> >
>>> >
>>> > I have set up OODT 0.12 on a server which runs FM listening on port 9000.
>>> > From a client machine I have verified that I can use FM to ingest
>>> > products.
>>> > I am now trying to use the crawler to crawl and ingest all files in a
>>> > directory. Since I have non-standard MIME types in these directories,
>>> > I have done the following:
>>> > 1. Added my own mime types in policy/mimetypes.xml, e.g.
>>> >         <mime-type type="text/fastq">
>>> >                 <glob pattern="*.fastq"/>
>>> >                 <glob pattern="*.fastq.gz"/>
>>> >                 <glob pattern="*.fastq.bz"/>
>>> >                 <glob pattern="*.fastq.bz2"/>
>>> >                 <glob pattern="*.fastq.bzip"/>
>>> >                 <glob pattern="*.fq"/>
>>> >                 <glob pattern="*.fq.gz"/>
>>> >                 <glob pattern="*.fq.bz"/>
>>> >                 <glob pattern="*.fq.bz2"/>
>>> >                 <glob pattern="*.fq.bzip"/>
>>> >         </mime-type>
>>> > 2. created the file policy/mime-extractor-map.xml
>>> >
>>> >         <mime type="text/fastq">
>>> >                 <extractor class="org.apache.oodt.cas.metadata.extractors.ExternMetExtractor">
>>> >                         <config file="/apache-oodt/crawler/bin/fastq.config"/>
>>> >                         <preCondComparators>
>>> >                                 <preCondComparator id="CheckThatDataFileSizeIsGreaterThanZero"/>
>>> >                         </preCondComparators>
>>> >                 </extractor>
>>> >         </mime>
>>> >
>>> > 3. created the file fastq.config
>>> > <?xml version="1.0" encoding="UTF-8"?>
>>> > <cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
>>> >   <exec workingDir="">
>>> >       <extractorBinPath>/apache-oodt/crawler/bin/MetExtractorNGS.pl</extractorBinPath>
>>> >       <args>
>>> >          <arg isDataFile="true"></arg>
>>> >          <arg>fastq</arg>
>>> >       </args>
>>> >   </exec>
>>> > </cas:externextractor>
>>> >
>>> >
>>> >
>>> > The MetExtractorNGS.pl is a small Perl script that opens the file to
>>> > be ingested, gets some information and stores it in the .met file
>>> > that corresponds to the file to be ingested. I have manually
>>> > verified that it works as expected, producing the correct met file.
>>> >
>>> > What am I missing here? Any ideas, comments, or suggestions will be
>>> > greatly appreciated.
>>> > Thanks in advance for any help
>>> > Kostas
>>> >
>>> >
>>> >
>>> > PS1 The full output from running the crawler command follows:
>>> >
>>> >
>>> > Setting property 'StdProductCrawler.filemgrUrl'
>>> > Setting property 'MetExtractorProductCrawler.filemgrUrl'
>>> > Setting property 'AutoDetectProductCrawler.filemgrUrl'
>>> > Setting property 'StdProductCrawler.clientTransferer'
>>> > Setting property 'MetExtractorProductCrawler.clientTransferer'
>>> > Setting property 'AutoDetectProductCrawler.clientTransferer'
>>> > Setting property 'StdProductCrawler.noRecur'
>>> > Setting property 'MetExtractorProductCrawler.noRecur'
>>> > Setting property 'AutoDetectProductCrawler.noRecur'
>>> > Setting property 'AutoDetectProductCrawler.mimeExtractorRepo'
>>> > Setting property 'StdProductCrawler.productPath'
>>> > Setting property 'MetExtractorProductCrawler.productPath'
>>> > Setting property 'AutoDetectProductCrawler.productPath'
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'AutoDetectProductCrawler.noRecur' set to value [true]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'StdProductCrawler.productPath' set to value [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'MetExtractorProductCrawler.noRecur' set to value [true]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'AutoDetectProductCrawler.mimeExtractorRepo' set to value [../policy/mime-extractor-map.xml]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'MetExtractorProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'AutoDetectProductCrawler.filemgrUrl' set to value [http://192.168.8.44:9000]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'AutoDetectProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'StdProductCrawler.noRecur' set to value [true]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'StdProductCrawler.filemgrUrl' set to value [http://192.168.8.44:9000]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'AutoDetectProductCrawler.productPath' set to value [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'MetExtractorProductCrawler.filemgrUrl' set to value [http://192.168.8.44:9000]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'StdProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory]
>>> > Apr 02, 2016 10:12:13 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>> > FINE: Property 'MetExtractorProductCrawler.productPath' set to value [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq]
>>> > Apr 02, 2016 10:12:13 PM org.apache.oodt.cas.crawl.ProductCrawler crawl
>>> > INFO: Crawling /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq
>>> > Apr 02, 2016 10:12:13 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.crawl.AutoDetectProductCrawler passesPreconditions
>>> > WARNING: No extractor specs specified for /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > WARNING: Failed to pass preconditions for ingest of product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz]
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz.met
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.metadata.preconditions.PreCondEvalUtils eval
>>> > INFO: Passed precondition comparator id CheckThatDataFileSizeIsGreaterThanZero
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Generating met file for product file: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz.met]
>>> > Apr 02, 2016 10:12:14 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Executing command line: [/celgene/software/apache-oodt/crawler/bin/MetExtractorNGS.pl /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz.met text ] with workingDir: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq] to extract metadata
>>> > OUTPUT: [WARN : MetExtractorNGS - 2016/04/02 22:12:15] - Input file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R1.fastq.gz.met will be ignored. .met files are not processed !
>>> > Apr 02, 2016 10:12:15 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > SEVERE: Failed to get metadata for product : Met extractor failed to create metadata file
>>> > org.apache.oodt.cas.metadata.exceptions.MetExtractionException: Met extractor failed to create metadata file
>>> >         at org.apache.oodt.cas.metadata.extractors.ExternMetExtractor.extrMetadata(ExternMetExtractor.java:120)
>>> >         at org.apache.oodt.cas.metadata.AbstractMetExtractor.extractMetadata(AbstractMetExtractor.java:74)
>>> >         at org.apache.oodt.cas.crawl.AutoDetectProductCrawler.getMetadataForProduct(AutoDetectProductCrawler.java:84)
>>> >         at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:136)
>>> >         at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:104)
>>> >         at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:74)
>>> >         at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:58)
>>> >         at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
>>> >         at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:188)
>>> >         at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
>>> >
>>> > Apr 02, 2016 10:12:15 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz
>>> > Apr 02, 2016 10:12:15 PM org.apache.oodt.cas.crawl.AutoDetectProductCrawler passesPreconditions
>>> > WARNING: No extractor specs specified for /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz
>>> > Apr 02, 2016 10:12:15 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > WARNING: Failed to pass preconditions for ingest of product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz]
>>> > Apr 02, 2016 10:12:15 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz.met
>>> > Apr 02, 2016 10:12:15 PM org.apache.oodt.cas.metadata.preconditions.PreCondEvalUtils eval
>>> > INFO: Passed precondition comparator id CheckThatDataFileSizeIsGreaterThanZero
>>> > Apr 02, 2016 10:12:16 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Generating met file for product file: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz.met]
>>> > Apr 02, 2016 10:12:16 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Executing command line: [/celgene/software/apache-oodt/crawler/bin/MetExtractorNGS.pl /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz.met text ] with workingDir: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq] to extract metadata
>>> > OUTPUT: [WARN : MetExtractorNGS - 2016/04/02 22:12:16] - Input file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/E837642_R2.fastq.gz.met will be ignored. .met files are not processed !
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > SEVERE: Failed to get metadata for product : Met extractor failed to create metadata file
>>> > org.apache.oodt.cas.metadata.exceptions.MetExtractionException: Met extractor failed to create metadata file
>>> >         at org.apache.oodt.cas.metadata.extractors.ExternMetExtractor.extrMetadata(ExternMetExtractor.java:120)
>>> >         at org.apache.oodt.cas.metadata.AbstractMetExtractor.extractMetadata(AbstractMetExtractor.java:74)
>>> >         at org.apache.oodt.cas.crawl.AutoDetectProductCrawler.getMetadataForProduct(AutoDetectProductCrawler.java:84)
>>> >         at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:136)
>>> >         at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:104)
>>> >         at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:74)
>>> >         at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:58)
>>> >         at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
>>> >         at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:188)
>>> >         at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
>>> >
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.log.gz
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.AutoDetectProductCrawler passesPreconditions
>>> > WARNING: No extractor specs specified for /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.log.gz
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > WARNING: Failed to pass preconditions for ingest of product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.log.gz]
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.tar.gz
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.AutoDetectProductCrawler passesPreconditions
>>> > WARNING: No extractor specs specified for /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.tar.gz
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > WARNING: Failed to pass preconditions for ingest of product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-04-02-16.tar.gz]
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-mnt-celgene.rnd.combio.mmgp.external-TestSeqData-RNA-Seq-RawData-fastq-04-02-16.tar.gz
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.AutoDetectProductCrawler passesPreconditions
>>> > WARNING: No extractor specs specified for /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-mnt-celgene.rnd.combio.mmgp.external-TestSeqData-RNA-Seq-RawData-fastq-04-02-16.tar.gz
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > WARNING: Failed to pass preconditions for ingest of product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/cas-crawler-mnt-celgene.rnd.combio.mmgp.external-TestSeqData-RNA-Seq-RawData-fastq-04-02-16.tar.gz]
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Handling file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.metadata.preconditions.PreCondEvalUtils eval
>>> > INFO: Passed precondition comparator id CheckThatDataFileSizeIsGreaterThanZero
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Generating met file for product file: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test]
>>> > Apr 02, 2016 10:12:17 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Executing command line: [/celgene/software/apache-oodt/crawler/bin/MetExtractorNGS.pl /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test text ] with workingDir: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq] to extract metadata
>>> > OUTPUT: [DEBUG : MetExtractorNGS - 2016/04/02 22:12:18] - Accessing NGS server at http://192.168.8.44:8082/RPC2
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: metadata for file_host are not in array format.Converting..
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: adding key/value [file_host]/[ip-192-168-8-66]
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: metadata for ProductType are not in array format.Converting..
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: adding key/value [ProductType]/[GenericFile]
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: metadata for ingest_user are not in array format.Converting..
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: adding key/value [ingest_user]/[kmavrommatis]
>>> > OUTPUT: [DEBUG : MetExtractorNGS - 2016/04/02 22:12:18] - The file path is ARRAY(0x22d3f48). It will be added under the FilePath metadata field
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: metadata for FilePath are not in array format.Converting..
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - addMetadata: adding key/value [FilePath]/[/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test]
>>> > OUTPUT: [DEBUG : MetExtractorNGS - 2016/04/02 22:12:18] - This file is of type text
>>> > OUTPUT: [DEBUG : MetExtractorNGS - 2016/04/02 22:12:18] - Storing metadata in file /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test.met
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - Changing /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test to
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - /mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - Changing kmavrommatis to
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - kmavrommatis
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - Changing GenericFile to
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - GenericFile
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - Changing ip-192-168-8-66 to
>>> > OUTPUT: [DEBUG : metadataPrepare - 2016/04/02 22:12:18] - ip-192-168-8-66
>>> > OUTPUT: [DEBUG : MetExtractorNGS - 2016/04/02 22:12:19] - Process finished SUCCESSFULLY
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.metadata.extractors.ExternMetExtractor extrMetadata
>>> > INFO: Met extraction successful for product file: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test]
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
>>> > INFO: ProductCrawler: Ready to ingest product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test]: ProductType: [GenericFile]
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.filemgr.ingest.StdIngester setFileManager
>>> > INFO: StdIngester: connected to file manager: [http://192.168.8.44:9000]
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferer setFileManagerUrl
>>> > INFO: In Place Data Transfer to: [http://192.168.8.44:9000] enabled
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.filemgr.ingest.StdIngester ingest
>>> > INFO: StdIngester: ingesting product: ProductName: [test]: ProductType: [GenericFile]: FileLocation: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/]
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient ingestProduct
>>> > FINEST: File Manager Client: clientTransfer enabled: transfering product [test]
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.filemgr.versioning.VersioningUtils createBasicDataStoreRefsFlat
>>> > FINE: VersioningUtils: Generated data store ref: file:/opt/oodt/data/archive/test/test from origRef: file:/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
>>> > INFO: Successfully ingested product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test]: product id: 4c8de2da-265a-48c4-8380-3f1103dfecfc
>>> > Apr 02, 2016 10:12:19 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
>>> > INFO: Successful ingest of product: [/mnt/celgene.rnd.combio.mmgp.external/TestSeqData/RNA-Seq/RawData/fastq/test]
>>> >
>>> >
>>> > *********************************************************
>>> > THIS ELECTRONIC MAIL MESSAGE AND ANY ATTACHMENT IS CONFIDENTIAL AND
>>> > MAY CONTAIN LEGALLY PRIVILEGED INFORMATION INTENDED ONLY FOR THE USE
>>> > OF THE INDIVIDUAL OR INDIVIDUALS NAMED ABOVE.
>>> > If the reader is not the intended recipient, or the employee or
>>> > agent responsible to deliver it to the intended recipient, you are
>>> > hereby notified that any dissemination, distribution or copying of
>>> > this communication is strictly prohibited. If you have received this
>>> > communication in error, please reply to the sender to notify us of
>>> > the error and delete the original message. Thank You.
>>> >
>>>
>>>
>>> --
>>> *Lewis*
>>>
>>>
>>
>>
>>
>>--
>>*Lewis*
>
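
In the crawler output quoted in this thread, every file ending in .gz gets the "No extractor specs specified" warning, while the extension-less file "test" and the .met files do reach an extractor. One way to narrow this down offline is to list which glob patterns claim a given file name and whether the resulting MIME type has an entry in the extractor map. The following is a minimal standalone Python sketch of that check, not OODT or Tika code. The `<mime-info>` and `<map>` wrapper elements, the `application/gzip` entry, and the "longest matching glob wins" rule are assumptions made for illustration and should be checked against the real policy files and the Tika version in use.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical fragments modelled on the snippets quoted in this thread.
# The wrapper elements and the application/gzip entry are assumptions.
MIMETYPES_XML = """
<mime-info>
  <mime-type type="text/fastq">
    <glob pattern="*.fastq"/>
    <glob pattern="*.fastq.gz"/>
  </mime-type>
  <mime-type type="application/gzip">
    <glob pattern="*.gz"/>
  </mime-type>
</mime-info>
"""

EXTRACTOR_MAP_XML = """
<map>
  <mime type="text/fastq">
    <extractor class="org.apache.oodt.cas.metadata.extractors.ExternMetExtractor"/>
  </mime>
</map>
"""


def candidate_types(filename, mimetypes_xml):
    """Return (pattern, mime type) pairs whose glob matches, longest pattern first."""
    root = ET.fromstring(mimetypes_xml)
    hits = []
    for mime in root.iter("mime-type"):
        for glob in mime.iter("glob"):
            pattern = glob.get("pattern")
            if fnmatch.fnmatchcase(filename, pattern):
                hits.append((pattern, mime.get("type")))
    return sorted(hits, key=lambda hit: len(hit[0]), reverse=True)


def has_extractor(mime_type, extractor_map_xml):
    """True if the map declares at least one extractor for mime_type."""
    root = ET.fromstring(extractor_map_xml)
    return any(
        mime.get("type") == mime_type and mime.find("extractor") is not None
        for mime in root.iter("mime")
    )


def diagnose(filename):
    """Return (filename, chosen mime type or None, whether an extractor is mapped)."""
    hits = candidate_types(filename, MIMETYPES_XML)
    if not hits:
        return filename, None, False
    chosen = hits[0][1]  # assumed rule: the longest matching glob wins
    return filename, chosen, has_extractor(chosen, EXTRACTOR_MAP_XML)


if __name__ == "__main__":
    for name in ("E837642_R1.fastq.gz", "cas-crawler-04-02-16.log.gz"):
        print(diagnose(name))
```

If the same check against the real policy files reports a type for *.fastq.gz that has no <mime> entry in mime-extractor-map.xml, that would be consistent with the warning, although it does not show what the crawler does internally.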
