Hi Reshu, re: your answers to 5 & 6

6a. Is the data that populates the CAS the "name" of a document or the
document itself? (The expected use of DUCC is to *not* pass the document
contents, which may, for example, be very large.)

6b. If it is a "name" or the like, is that something you can share so I can
try to reproduce here?

Lou.

On Wed, Mar 26, 2014 at 9:20 AM, reshu.agarwal <[email protected]> wrote:

> Hi Lou,
>
> On 03/26/2014 04:27 PM, Lou DeGenaro wrote:
>
>> Hi Reshu,
>>
>> The good news is that DUCC is functional since 1.job works. So we need to
>> find out why your particular job fails.
>>
>> A few more questions:
>>
>> 5. Does your job consist of multiple work items (CASes), and do any of
>> them succeed?
>>
> My job consists of multiple work items, and I have also tried a job with a
> single document. Both types of jobs have succeeded many times, but I get
> this problem with one particular document within a job. If I exclude this
> document, the job succeeds.
>
>> 6. DUCC has a Job Driver (JD) that employs your CollectionReader (CR) to
>> fetch CASes that are sent via a broker for processing by one of the
>> distributed Job Processes (JPs) that each run a copy of your
>> AnalysisEngine (AE). Normally, as Eddie points out, these CASes comprise
>> some index that's interpreted by the assigned JP to know which data is to
>> be worked on. For example, say you have 100 documents, each 5GB in size,
>> named doc.1, doc.2,...doc.100. Your CR should not pass the actual 5GB
>> document, but rather "doc.1". Is that the kind of scheme you are
>> employing?
>>
> Lou, I am fetching batch data from a database and sending a reference from
> the result set to the CAS. I am not using file processing.
>
>> 7. Do you have a small test case that you can share that reliably
>> demonstrates the problem?
>>
> Test Case:
>
> I have two systems within the DUCC cluster with 20 GB RAM each.
> I have defined a job with this configuration:
>
> classpath_order              ducc-before-user
> driver_descriptor_CR         ../collection_reader/DBCollectionReader.xml
> process_deployments_max      6
> process_descriptor_AE        ../aeAggregate
> process_descriptor_CC        ../cas_consumer/CASConsumer
> process_failures_limit       50
> process_memory_size          4
> process_per_item_time_max    3
> process_thread_count         3
> specification                22.job
> working_directory            ../ducc/Uima_ducc
>
> I am fetching data from a database in the CR. After the CR's getNext()
> method is executed for this particular document, a warning like this is
> printed in JD.log:
>
> Mar 26, 2014 9:40:25 AM org.apache.uima.adapter.jms.client.
> BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
> WARNING:
>
> The document remains in the queue for 5 minutes, i.e. for the queue
> waiting time.
>
> Then, if the batch size is 100, it shows lost=1; if it is 200, the document
> remains in the queue until I forcibly terminate the job.
>
>> Lou.
>>
>> On Wed, Mar 26, 2014 at 5:31 AM, reshu.agarwal <[email protected]> wrote:
>>
>>> On 03/20/2014 06:35 PM, Lou DeGenaro wrote:
>>>
>>>> Where does the warning appear, in a log file in the job's log
>>>> directory? Is there any other information related to that warning?
>>>>
>>> Hi Lou,
>>>
>>> Answers to your questions are given below. Hope they help:
>>>
>>> 1. Are you able to run a simple job, such as 1.job from the examples
>>> directory successfully?
>>>
>>> Yes, I am able to run that simple job successfully.
>>>
>>> 2. Where does the warning appear, in a log file in the job's log
>>> directory? Is there any other information related to that warning?
>>>
>>> This warning appears in the JD.log file.
>>>
>>> After all the initialization messages, these messages appear:
>>>
>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngine_impl setupConnection
>>> INFO: UIMA AS Client Created Shared Connection To Broker:
>>> tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.
>>> useCompression=true&
>>> closeAsync=false
>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngine_impl initializeProducer
>>> INFO: Initializing JMS Message Producer. Broker:
>>> tcp://S1:61616?wireFormat.
>>> maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Queue
>>> Name: ducc.jd.queue.1317
>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngine_impl initializeConsumer
>>> INFO: Initializing JMS Message Consumer. Broker:
>>> tcp://S1:61616?wireFormat.
>>> maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Queue
>>> Name: ID:S144-36678-1395807465286-7:1:1
>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngine_impl initialize
>>> INFO: Asynchronous Client Has Been Initialized. Serialization Strategy:
>>> [SerializationStrategy] Ready To Process.
>>>
>>> and then only this warning message appears:
>>>
>>> Mar 26, 2014 9:49:27 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
>>> WARNING:
>>>
>>> then these messages appear:
>>>
>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngineCommon_impl stop
>>> INFO: Stopping Asynchronous Client.
>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngineCommon_impl stop
>>> INFO: Asynchronous Client Has Stopped.
>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngineCommon_impl$SharedConnection destroy
>>> INFO: UIMA AS Client Shared Connection Has Been Closed
>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngine_impl stop
>>>
>>> 3. Are there any exceptions in any of the logs in the job's log
>>> directory?
>>>
>>> Yes. When this warning appears, then after all documents from the DB
>>> collection reader except this particular one have been processed
>>> successfully, this message shows up in one of the process log files:
>>>
>>> Mar 26, 2014 9:54:04 AM org.apache.uima.adapter.jms.
>>> activemq.JmsOutputChannel$ConnectionTimer startSessionReaperTimer.run
>>> INFO: Thread: 210 Component: CorefernceAggDescriptor Jms Session
>>> Inactivity Timeout: 5 Minutes on Broker: tcp://S1:61616?wireFormat.
>>> maxInactivityDuration=0&closeAsync=false
>>>
>>> I think this is due to that warning.
>>>
>>> 4. Does your job use a version of UIMA/UIMA-AS that is different than the
>>> one used by DUCC?
>>>
>>> I am using DUCC version 1.0.0 and UIMA version 2.4.2. I am not able to
>>> find out which UIMA version DUCC uses.
>>>
>>> --
>>> Thanks and Regards,
>>> Reshu Agarwal
>>> Software Engineer
>>> Orkash Services Pvt Ltd
>>>
> Reshu.
>
