Hi John,

Thanks for the insight! We have image and rectilinear grids (currently 512^3, which is not that big, but these are growing as we get more CPU hours). We use seed points from a plane with a higher-than-grid resolution, which intersects a number of subdomains, so there is some potential to integrate in parallel. At this point I am not sure it will help (and after reading your and others' comments, even less so). There is a big serial component to the algorithm, and the load is imbalanced.

Great that you found some ways to boost the performance! Any speed up will be very helpful in this application.

Burlen

John Biddiscombe wrote:
Burlen

I have had performance issues with the Distributed Stream tracer, but in fact I found that, in general, its not being very well optimized for parallel operation was not the main trouble. If you are using Unstructured Grids, and they are large (in my case 20 million cells in a block), then most of the time is taken by building the cell links that FindCell uses to locate the cell in which an integration point lies. I modified the stream tracer interpolation to use a BSP tree (or CellLocator) and found a huge improvement in execution time (minutes instead of hours).
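To illustrate why a locator helps, here is a minimal sketch in plain Python. It is not VTK code and not the BSP tree mentioned above: it swaps in a simple uniform-bin locator, and the names make_cells, find_cell_brute and BinLocator are made up for this example. The cells are axis-aligned squares stored in shuffled order, so a cell's index says nothing about its position, as in an unstructured grid.

```python
import random
from collections import defaultdict

def make_cells(n):
    """Return n*n unit-square cells as (xmin, ymin, xmax, ymax) in shuffled order."""
    cells = [(i, j, i + 1, j + 1) for i in range(n) for j in range(n)]
    random.Random(0).shuffle(cells)
    return cells

def contains(cell, x, y):
    xmin, ymin, xmax, ymax = cell
    return xmin <= x < xmax and ymin <= y < ymax

def find_cell_brute(cells, x, y):
    """Linear search over all cells: O(number of cells) per query."""
    for idx, cell in enumerate(cells):
        if contains(cell, x, y):
            return idx
    return -1

class BinLocator:
    """Hypothetical uniform-bin locator: each query tests only the cells in one bin."""

    def __init__(self, cells, bin_size=4.0):
        self.cells = cells
        self.bin_size = bin_size
        self.bins = defaultdict(list)
        for idx, (xmin, ymin, xmax, ymax) in enumerate(cells):
            # Register the cell in every bin its bounding box overlaps.
            bx0, bx1 = int(xmin // bin_size), int(xmax // bin_size)
            by0, by1 = int(ymin // bin_size), int(ymax // bin_size)
            for bx in range(bx0, bx1 + 1):
                for by in range(by0, by1 + 1):
                    self.bins[(bx, by)].append(idx)

    def find_cell(self, x, y):
        key = (int(x // self.bin_size), int(y // self.bin_size))
        for idx in self.bins.get(key, ()):
            if contains(self.cells[idx], x, y):
                return idx
        return -1  # point lies outside every cell
```

An integrator calls the point-location routine at every step, so replacing the linear search with locator.find_cell(x, y) cuts the per-step cost from all cells to the few cells in one bin. A tree-based locator follows the same idea with adaptive subdivision.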

Secondly, the parallelization of the stream tracer is an inherent problem. One cannot integrate the streamline in block 2 until it has reached a boundary in block 1: one must wait until the streamline traverses one block before passing it to the next. In practice, the implementation could be improved with more intelligent seeding and sending/receiving of streamline seeds etc. between iterations.
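The hand-off dependency can be sketched with a small serial simulation in plain Python. This is an illustration only, not the Distributed Stream tracer's implementation: the uniform flow field, the 1D block decomposition along x, the forward Euler step, and all names (velocity, owner, trace_distributed) are assumptions made for the example.

```python
def velocity(x, y):
    """Hypothetical uniform flow in +x."""
    return 1.0, 0.0

def owner(x, block_width, n_blocks):
    """Index of the block that owns x, or None if x is outside the domain."""
    b = int(x // block_width)
    return b if 0 <= b < n_blocks else None

def trace_distributed(seeds, n_blocks=4, block_width=1.0, dt=0.25, max_points=1000):
    """Advance streamlines block by block, handing them off between rounds."""
    queues = [[] for _ in range(n_blocks)]
    for sid, (x, y) in enumerate(seeds):
        b = owner(x, block_width, n_blocks)
        if b is not None:
            queues[b].append((sid, x, y))
    lines = {sid: [seed] for sid, seed in enumerate(seeds)}
    active_per_round = []
    while any(queues):
        # Blocks listed here are the only ones with work in this round.
        active_per_round.append([b for b, q in enumerate(queues) if q])
        next_queues = [[] for _ in range(n_blocks)]
        for b, q in enumerate(queues):  # conceptually, one process per block
            for sid, x, y in q:
                # Forward Euler steps while the point stays inside block b.
                while owner(x, block_width, n_blocks) == b and len(lines[sid]) < max_points:
                    vx, vy = velocity(x, y)
                    x, y = x + dt * vx, y + dt * vy
                    lines[sid].append((x, y))
                nb = owner(x, block_width, n_blocks)
                if nb is not None and nb != b:
                    next_queues[nb].append((sid, x, y))  # hand off to the next block
        queues = next_queues
    return lines, active_per_round
```

For a single seed at (0.1, 0.5), active_per_round comes out as [[0], [1], [2], [3]]: only one block is busy in each round, which is the serial component described above. Seeding several blocks at once fills more queues per round, which is where smarter seeding could help.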

The Particle tracer code could be modified to produce streamlines in a serial or distributed manner and ought to give a 'reasonably' optimal solution to the problem - but in fact the chaps at Kitware are at the moment (they tell me) in the process of revamping the streamline code to make use of CellLocators - and for this reason I recently committed my BSP tree code.

Here's how to check your bottleneck.
Find a large StructuredGrid dataset which is loaded in parallel. Generate streamlines. Time it. Convert the grid to UnstructuredGrid and do the same. If test 1 takes 1 minute and test 2 takes 1 hour, then it isn't the parallelization that's the real issue, but the grid being used.
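The comparison can be wrapped in a small timing helper. This is a sketch under assumptions: time_call and diagnose are hypothetical names, the functions you pass in stand for whatever generates the streamlines on each grid type, and the factor-of-10 threshold is an arbitrary choice for illustration.

```python
import time

def time_call(fn, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def diagnose(t_structured, t_unstructured, threshold=10.0):
    """Compare the two timings; threshold is an assumed cutoff, not a measured one."""
    ratio = t_unstructured / t_structured
    if ratio >= threshold:
        return "grid type dominates (ratio %.1f)" % ratio
    return "grid type is not the main cost (ratio %.1f)" % ratio
```

With the numbers from the example above, diagnose(60.0, 3600.0) reports that the grid type dominates, with a ratio of 60.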

JB




We've been using the distributed stream tracer to generate 100s-1000s of streamlines per time step. It's very slow, and it doesn't scale at all. The class comments say as much. I'm sure there is a reason why this implementation was chosen. Is there something that generally prevents a real parallel implementation? Is there a better implementation available out there?

There was this post a while back:
http://www.paraview.org/pipermail/paraview/2009-July/012959.html

What's the status?

Thanks
Burlen







_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at http://www.kitware.com/opensource/opensource.html

Please keep messages on-topic and check the ParaView Wiki at: http://paraview.org/Wiki/ParaView

Follow this link to subscribe/unsubscribe:
http://www.paraview.org/mailman/listinfo/paraview


