The issue seemed weird to me as well. It was not reproducible, so I assumed that something must have gone wrong with the installation.

I had this issue occur in January and it just happened again over the weekend. This was using Ignite 1.5.0.final.

I've verified that all the nodes are configured using FifoQueueCollisionSpi with parallelJobsNumber = 1.

The nodes which execute the jobs are configured via XML:
    ... <property name="collisionSpi">
            <bean class="org.apache.ignite.spi.collision.fifoqueue.FifoQueueCollisionSpi">
                <property name="parallelJobsNumber" value="1"/>
            </bean>
        </property> ...

Based on your previous response, I believe the collision SPI on the node submitting the task does not matter. Just in case, that node also has the SPI configured:
            IgniteConfiguration igniteConfig = new IgniteConfiguration();
            igniteConfig.setMarshaller(new OptimizedMarshaller());
            igniteConfig.setMetricsLogFrequency(3600000);
            FifoQueueCollisionSpi colSpi = new FifoQueueCollisionSpi();
            colSpi.setParallelJobsNumber(1);
            igniteConfig.setCollisionSpi(colSpi);
            ...

On the previous occurrence of this bug I added this code to the job execution:
        CollisionSpi collisionSpi = grid.configuration().getCollisionSpi();
        if (collisionSpi instanceof FifoQueueCollisionSpi) {
            FifoQueueCollisionSpi fifo = (FifoQueueCollisionSpi) collisionSpi;
            int parallelJobsNumber = fifo.getParallelJobsNumber();
            _logger.info("FifoQueueCollisionSpi used with parallelJobsNumber:" + parallelJobsNumber);
        } else {
            _logger.info("CollisionSpi is not FifoQueueCollisionSpi but:" + collisionSpi.getClass().getSimpleName());
        }

And in the logs I see:
FifoQueueCollisionSpi used with parallelJobsNumber:1

However, I also see three jobs starting on the same node. The jobs can take minutes to hours to complete and, unfortunately, they have to interact with a GUI application. When multiple jobs execute at the same time, there are race conditions over which workspace the GUI application has open. The GUI application also computes some values during job execution, and if multiple computations run at the same time the results get mixed up.

Are there known issues with FifoQueueCollisionSpi? Are there any workarounds? I'm considering adding an AtomicInteger counter check in the job execution code. Do you have any suggestions? I was thinking that if I had failover set up, it should be safe to fail any jobs that attempt to start concurrently.
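
To make the counter idea concrete, here is a rough sketch of the check I have in mind. It is plain Java with no Ignite calls; the class name SingleJobGuard and its methods are made up for illustration. It assumes one JVM per node and that the class is loaded only once in that JVM (separate class loaders would each get their own counter). It throws IllegalStateException as a placeholder; in a real job I would expect to throw whatever exception the configured failover SPI reacts to (I believe ComputeJobFailoverException is meant for that, but I have not verified it against 1.5.0).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-JVM guard that rejects a job if another one is already running. */
public final class SingleJobGuard {
    /** 0 = idle, 1 = a job is running in this JVM. */
    private static final AtomicInteger RUNNING = new AtomicInteger();

    private SingleJobGuard() {
        // Static helper, not meant to be instantiated.
    }

    /**
     * Runs the body only if no other guarded body is running in this JVM.
     * Fails fast instead of waiting, so failover can send the job elsewhere.
     */
    public static <T> T runExclusively(Callable<T> body) throws Exception {
        // Atomically claim the slot; if it is already taken, reject this job.
        if (!RUNNING.compareAndSet(0, 1)) {
            throw new IllegalStateException("Another job is already running on this node");
        }
        try {
            return body.call();
        } finally {
            // Only the caller that claimed the slot reaches this point, so releasing is safe.
            RUNNING.set(0);
        }
    }

    /** True while a guarded body is executing. */
    public static boolean isBusy() {
        return RUNNING.get() != 0;
    }
}
```

The job's execute() method would wrap its real work in SingleJobGuard.runExclusively(...). If waiting is preferable to failing, the same spot could hold a lock instead, at the cost of tying up a thread for the minutes to hours a job can take.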

Lastly, thanks for the hard work on Ignite (and GridGain!).

-Ryan




On 11/7/2016 6:04 PM, vkulichenko wrote:
Collision SPI is called on the node that executes the job. Having said that,
what you tell sounds a bit weird. Are you sure other nodes didn't lose the
config as well?

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Concurrent-job-execution-and-FifoQueueCollisionSpi-parallelJobsNumber-1-tp8697p8749.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

