On 03/26/2014 10:06 PM, Lou DeGenaro wrote:
Hi Reshu,
re: your answers to 5 & 6
6a. Is the data that populates the CAS the "name" of a document or the
document itself? (The expected use of DUCC is to *not* pass the
document contents, which may, for example, be very large.)
6b. If it is a "name" or the like, is that something you can share so I
can try to reproduce here?
Lou.
On Wed, Mar 26, 2014 at 9:20 AM, reshu.agarwal <[email protected]> wrote:
Hi Lou,
On 03/26/2014 04:27 PM, Lou DeGenaro wrote:
Hi Reshu,
The good news is that DUCC is functional since 1.job works. So we need to
find out why your particular job fails.
A few more questions:
5. Does your job consist of multiple work items (CASes), and do any of
them succeed?
My job consists of multiple work items, and I have also tried a job with a
single document. Both types of job have succeeded many times, but I get
this problem on one particular document within a job. If I exclude this
document, the job succeeds.
6. DUCC has a Job Driver (JD) that employs your CollectionReader (CR) to
fetch CASes that are sent via a broker for processing by one of the
distributed Job Processes (JPs), each of which runs a copy of your
AnalysisEngine (AE). Normally, as Eddie points out, these CASes comprise
some index that is interpreted by the assigned JP to know which data is to
be worked on. For example, say you have 100 documents, each 5GB in size,
named doc.1, doc.2,...doc.100. Your CR should not pass the actual 5GB
document, but rather "doc.1". Is that the kind of scheme you are employing?
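The scheme described in question 6 can be sketched in plain Java as below. This is only an illustration under stated assumptions, not DUCC or UIMA code: the class ReferenceCasSketch, the in-memory STORE map, and the methods nextWorkItem and loadDocument are all hypothetical names, and the map merely stands in for whatever database or file system every JP can reach.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "pass the name, not the document".
final class ReferenceCasSketch {

    // Stand-in for a shared database or file system reachable from every JP (assumption).
    static final Map<String, String> STORE = new HashMap<>();

    // Job Driver side: the work item carries only a short key such as "doc.1".
    static String nextWorkItem(int index) {
        return "doc." + index;
    }

    // Job Process side: the content is looked up by key just before analysis.
    static String loadDocument(String workItem) {
        String text = STORE.get(workItem);
        if (text == null) {
            throw new IllegalArgumentException("unknown work item: " + workItem);
        }
        return text;
    }

    public static void main(String[] args) {
        STORE.put("doc.1", "full text of document 1");
        String item = nextWorkItem(1);
        // Only the key would travel through the broker.
        System.out.println(item.length() + " chars sent for " + item);
        // The full text is fetched where it is processed.
        System.out.println(loadDocument(item).length() + " chars loaded in the JP");
    }
}
```

With this split, the size of what travels between JD and JP stays constant no matter how large an individual document is.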
Lou, I am fetching batch data from a database and sending a reference from
the result set to the CAS. I am not using file processing.
7. Do you have a small test case that you can share that reliably
demonstrates the problem?
Test Case:
I have two systems within the DUCC cluster, with 20 GB RAM each.
I have defined a job with these configurations:
classpath_order ducc-before-user
driver_descriptor_CR ../collection_reader/DBCollectionReader.xml
process_deployments_max 6
process_descriptor_AE ../aeAggregate
process_descriptor_CC ../cas_consumer/CASConsumer
process_failures_limit 50
process_memory_size 4
process_per_item_time_max 3
process_thread_count 3
specification 22.job
working_directory ../ducc/Uima_ducc
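As a rough reading of the settings above, the following sketch works out the peak concurrency and memory they imply. It assumes that process_memory_size is expressed in GB and that every deployment runs its full thread count; the class and method names are hypothetical, and the figures ignore the Job Driver and any DUCC overhead.

```java
// Hypothetical back-of-the-envelope sizing for the job specification above.
final class JobSizingSketch {

    // Upper bound on work items in flight: one per thread in every process.
    static int maxConcurrentWorkItems(int deploymentsMax, int threadCount) {
        return deploymentsMax * threadCount;
    }

    // Upper bound on memory requested by all job processes together, in GB.
    static int maxProcessMemoryGb(int deploymentsMax, int memorySizeGb) {
        return deploymentsMax * memorySizeGb;
    }

    public static void main(String[] args) {
        int deploymentsMax = 6; // process_deployments_max
        int threadCount = 3;    // process_thread_count
        int memorySizeGb = 4;   // process_memory_size (assumed to be GB)
        int clusterGb = 2 * 20; // two machines with 20 GB RAM each

        System.out.println("work items in flight: "
                + maxConcurrentWorkItems(deploymentsMax, threadCount));
        System.out.println("process memory requested: "
                + maxProcessMemoryGb(deploymentsMax, memorySizeGb)
                + " GB of " + clusterGb + " GB");
    }
}
```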
I am fetching data from a database in the CR. After the CR's getNext()
method executes for the particular document, a warning message like this
is printed in JD.log:
Mar 26, 2014 9:40:25 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
WARNING:
The document remains in the queue for 5 minutes, i.e. for as long as the
queue waiting time.
After that, if the batch size is 100 the job shows lost=1; if it is 200,
the document still remains in the queue until I forcibly terminate the job.
Lou.
On Wed, Mar 26, 2014 at 5:31 AM, reshu.agarwal <[email protected]> wrote:
On 03/20/2014 06:35 PM, Lou DeGenaro wrote:
Where does the warning appear, in a log file in the job's log
directory? Is there any other information related to that warning?
Hi Lou,
Answers to your questions are given below. I hope they help:
1. Are you able to run a simple job, such as 1.job from the examples
directory successfully?
Yes, I am able to run that simple job successfully.
2. Where does the warning appear, in a log file in the job's log
directory? Is there any other information related to that warning?
This warning appears in the JD.log file.
It comes after all the initialization messages and these messages:
Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl setupConnection
INFO: UIMA AS Client Created Shared Connection To Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false
Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl initializeProducer
INFO: Initializing JMS Message Producer. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Queue Name: ducc.jd.queue.1317
Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl initializeConsumer
INFO: Initializing JMS Message Consumer. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Queue Name: ID:S144-36678-1395807465286-7:1:1
Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl initialize
INFO: Asynchronous Client Has Been Initialized. Serialization Strategy: [SerializationStrategy] Ready To Process.
and then only this warning message comes:
Mar 26, 2014 9:49:27 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
WARNING:
then these messages come:
Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl stop
INFO: Stopping Asynchronous Client.
Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl stop
INFO: Asynchronous Client Has Stopped.
Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl$SharedConnection destroy
INFO: UIMA AS Client Shared Connection Has Been Closed
Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl stop
3. Are there any exceptions in any of the logs in the job's log directory?
Yes. When this warning appears, all documents from the DB collection
reader except this particular one are processed successfully, and then
this message shows up in the log file of one of the processes:
Mar 26, 2014 9:54:04 AM org.apache.uima.adapter.jms.activemq.JmsOutputChannel$ConnectionTimer startSessionReaperTimer.run
INFO: Thread: 210 Component: CorefernceAggDescriptor Jms Session Inactivity Timeout: 5 Minutes on Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&closeAsync=false
I think this is due to that warning.
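To relate the timestamps quoted above, a small sketch like the following can compute the gaps between log entries. It assumes the JD and process logs use the same clock and the timestamp layout shown in the excerpts; the class LogGapSketch and its gap method are hypothetical helpers, not part of UIMA-AS or DUCC.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for measuring the time between two quoted log entries.
final class LogGapSketch {

    // Matches stamps such as "Mar 26, 2014 9:49:04 AM" from the logs above.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("MMM d, yyyy h:mm:ss a", Locale.ENGLISH);

    // Elapsed time from the earlier stamp to the later one.
    static Duration gap(String earlier, String later) {
        return Duration.between(
                LocalDateTime.parse(earlier, STAMP),
                LocalDateTime.parse(later, STAMP));
    }

    public static void main(String[] args) {
        // JD client initialized -> session inactivity timeout in the process log.
        System.out.println(gap("Mar 26, 2014 9:49:04 AM", "Mar 26, 2014 9:54:04 AM"));
        // sendAndReceiveCAS warning -> JD client stopped.
        System.out.println(gap("Mar 26, 2014 9:49:27 AM", "Mar 26, 2014 9:59:45 AM"));
    }
}
```

The first gap matches the 5 minute inactivity timeout reported in the process log, which is consistent with, though not proof of, the session having been idle since the client connected.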
4. Does your job use a version of UIMA/UIMA-AS that is different than the
one used by DUCC?
I am using DUCC version 1.0.0 and UIMA version 2.4.2. I am not able to
find out which UIMA version DUCC uses.
--
Thanks and Regards,
Reshu Agarwal
Software Engineer
Orkash Services Pvt Ltd
Reshu.
Hi Lou,
I am sending the reference to the document as in the code given below:

    // v_result is the ResultSet object returned by the database query
    String originalText = v_result.getString("content").toString();
    JCas jcas;
    try {
        jcas = aCAS.getJCas();
    } catch (CASException e) {
        throw new CollectionException(e);
    }
    jcas.setDocumentText(originalText);
--
Thanks,
Reshu Agarwal