Hello Jinghua,
Now you have run the correct test, and, as I suspected in my
first reply, one of your machines performs badly: all machines
should render the same amount of data, only the data itself
should differ between machines.
Regarding the difference you see with range [0 0.25], I
suspect the following:
1) 3D textures should have power-of-two sizes, so if you give a
size that is not a power of two, it will be padded up to the
next power of two and some space on your graphics card is
wasted.
2) eVolve creates small borders of 1-2 voxels on the edges
between two slabs of the volume, for correct blending between
slabs.
From 1) and 2): if your [0 0.25] slab is exactly a power of two
deep, the 3D texture will actually be much bigger (twice as big,
in fact), because the border pushes it just past that power of
two and the padding then rounds it up to the next one. When you
render it as range [0 1], it fits perfectly on the GPU and
renders faster because no borders are added. If this is the
case, a simple solution would be to render a slightly smaller
total volume, taking these borders into account.
Check in the source code which borders are added, and try to
render a slightly smaller volume than an exact power of two (or
use smaller ranges, say [0 0.248] or so).
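To make the arithmetic concrete, here is a minimal sketch of the
padding effect. It is not eVolve's actual code: the function names,
the rounding of the range to whole slices, and the rule "add the
border only on edges shared with a neighbouring slab" are my
assumptions for illustration, so check them against the real source.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= n (returns 1 for n == 0).
std::uint64_t nextPowerOfTwo(std::uint32_t n)
{
    std::uint64_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

// Depth (in slices) of the 3D texture allocated for one slab.
//   totalDepth : the d value of the whole volume (as in the .vhf file)
//   start, end : the range from the config file, within [0, 1]
//   border     : extra voxels per shared edge (1-2 per the text above)
// ASSUMPTION: a border is added only where the slab touches another
// slab, i.e. not at 0.0 and not at 1.0. This rule is hypothetical.
std::uint64_t paddedDepth(std::uint32_t totalDepth, double start,
                          double end, std::uint32_t border)
{
    assert(start >= 0.0 && end <= 1.0 && start < end);

    // Round the fractional range to a whole number of slices.
    std::uint32_t slices =
        static_cast<std::uint32_t>((end - start) * totalDepth + 0.5);

    if (start > 0.0)
        slices += border; // edge shared with the previous slab
    if (end < 1.0)
        slices += border; // edge shared with the next slab

    return nextPowerOfTwo(slices);
}
```

Under these assumptions, paddedDepth(256, 0.0, 1.0, 1) stays at 256,
while paddedDepth(1024, 0.0, 0.25, 1) becomes 257 slices and is
padded to 512. With w=h=1024 and one byte per voxel that is 512 MB
of texture instead of 256 MB. A range of [0 0.248] gives 255 slices
including the border, which fits in 256 again.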
Best regards,
Makhinya Maxim
On Feb 18, 2009, at 3:18 PM, jinghua wrote:
>
> Dear Maxim and Stefan,
>
> I am doing tests today, and found out something which I believe is
> my problem.
>
> When I do the 1-node test, I used to change the .vhf file d value to
> be 256. (I change d=2048 for the 8-node test and d=1024 for the
> 4-node test.)
>
> w=1024
> h=1024
> d=256
>
> and in my config file the range is [0 1]
>
> It renders at 16 fps.
>
> But today I decided that I don't want to change the .vhf file all
> the time; instead I change the config file. So I set d=1024 in the
> .vhf file. For the 4-node test, each node gets range [0 0.25],
> [0.25 0.5], [0.5 0.75], [0.75 1.0]. With no compound it renders at
> 6.7 fps. Then I did the 1-node test, where I set the range to
> [0 0.25]. It also renders at 6.7 fps!
>
> So that must be the problem, right? I am checking my rawVolModel.cpp
> file now, to see if I did something when loading the data (I don't
> believe so, though). Why does setting the range to [0 0.25] make the
> 1-node test render slowly?
>
> Yesterday I sent this post with two attachments of the 4-node and
> 1-node statistics screenshots. Somehow it didn't get through. This
> time I am omitting the attachments.
>
> Thanks!
>
> Jinghua
>
>
>
> Maxim Makhinya wrote:
>>
>>
>> Hello Jinghua,
>>
>> Your config file seems to be right; there should be no additional
>> transfers as long as you don't have any input/output frames.
>>
>> Do you have your data cached locally? Do you load it only once
>> per node? Might be a silly question, but are you using
>> CXXFLAGS=-O to compile it in release, not debug? =)
>>
>> What else you can do:
>>
>> 1) check what the rendering statistics show:
>> http://www.equalizergraphics.com/documents/design/statisticsOverlay.html
>> (triggered by the 's' key in eVolve and eqPly; one of your nodes
>> should be described as appNode in the config file, see the
>> "2-node.DB.eqc" example)
>>
>> 2) try "latency 2" or 3 in your config{ } file
>> http://www.equalizergraphics.com/documents/design/fileFormat.html
>>
>> 3) try to render a very small portion of your data, "range [ 0 .001 ]"
>> or something like that, to check Equalizer's overhead for your
>> setup. Rendering should be very, very fast; the only thing you will
>> see in the statistics is Equalizer's communication.
>>
>> 4) check your network performance with the "netperf" tool.
>>
>>
>> Best regards,
>>
>> Makhinya Maxim
>>
>>
>> On Feb 17, 2009, at 2:31 AM, Jinghua Ge wrote:
>>
>>> Dear Maxim,
>>>
>>> I did some tests today. Since two of my cluster nodes are not
>>> working properly, I only tested with a single node, 2 nodes, and 4
>>> nodes. It turns out I was wrong about the 4-node performance before;
>>> in retrospect, I think I had set the window size to be small when I
>>> did the 4-node test. Anyway, the results I got today are:
>>>
>>> single node: 16 fps
>>> 2nodes: 10 fps
>>> 4nodes: 6 fps
>>>
>>> I also found that in my test, the DS compound doesn't improve
>>> overall performance.
>>>
>>> I tried to remove the compound by commenting out all the inputframe
>>> and outputframe lines in my config file. (I did remove the whole
>>> compound at first, but found out I must set the range info for each
>>> node, otherwise the data wasn't distributed.)
>>>
>>> The compound looks like this:
>>>
>>> compound
>>> {
>>> channel "channel0"
>>> buffer [ COLOR DEPTH ]
>>>
>>> wall
>>> {
>>> bottom_left [ -.5 -.5 -.75 ]
>>> bottom_right [ .5 -.5 -.75 ]
>>> top_left [ -.5 .5 -.75 ]
>>> }
>>>
>>> compound
>>> {
>>> range [ 0 .25 ]
>>> }
>>> compound
>>> {
>>> channel "channel1"
>>> range [ .25 .5 ]
>>> #outputframe {}
>>> }
>>> compound
>>> {
>>> channel "channel2"
>>> range [ .5 .75 ]
>>> #outputframe {}
>>> }
>>> compound
>>> {
>>> channel "channel3"
>>> range [ .75 1 ]
>>> #outputframe {}
>>> }
>>> #inputframe { name "frame.channel1" }
>>> #inputframe { name "frame.channel2" }
>>> #inputframe { name "frame.channel3" }
>>> }
>>>
>>> The result is about 7 fps. I believe that with the way I did the
>>> compound there are still network transfers going on, just no final
>>> compositing. But I am not sure how to disable all of the network
>>> traffic by editing the config file. Please give more advice here.
>>> Thanks!!
>>>
>>> Jinghua
>>>
>>>
>>>
>>> On Mon, Feb 16, 2009 at 10:22 AM, Jinghua Ge <[email protected]>
>>> wrote:
>>> Hi Maxim,
>>>
>>> The tests you suggested in your email really make a lot of sense. I
>>> will try them today and hopefully find the bottleneck. Thanks so
>>> much!
>>>
>>> Jinghua
>>>
>>>
>>> On Mon, Feb 16, 2009 at 9:58 AM, Maxim Makhinya
>>> <[email protected]> wrote:
>>>
>>> Hello Jinghua,
>>>
>>>
>>> That sounds weird. Are you sure the problem is not with one of your
>>> machines? I think you should first figure out where your
>>> performance bottleneck is, and why this happens. You could try the
>>> following and write back what you get:
>>>
>>> 1) remove all compositing paths, i.e. leave only rendering. As all
>>> nodes will render the same amount of data without compositing, it
>>> should not really matter how many you use: 4 nodes for 1 GB or 8
>>> nodes for 2 GB. Speed should remain roughly the same, as no
>>> pictures are transferred.
>>>
>>> 2) try to split the config into two independent parts, i.e. 4 nodes
>>> for the first 1 GB of data, another 4 nodes for the second GB,
>>> without final compositing of these two parts. Again, it should be
>>> symmetric and speed shouldn't change much.
>>>
>>> 3) if in the second step you get your 10-12 fps, then just try to
>>> combine those results in one additional compositing step on top.
>>>
>>>
>>> Best regards,
>>>
>>> Makhinya Maxim
>>>
>>>
>>> On Feb 16, 2009, at 4:01 PM, jinghua wrote:
>>>
>>>>
>>>> Dear Stefan,
>>>>
>>>> I have tried Equalizer's eVolve volume renderer to render a
>>>> 1kx1kx2k ubyte volume over a remote 8-node cluster. I have changed
>>>> the eVolve code and shader to read in the original volume with one
>>>> byte per voxel, and it all worked fine. My cluster has an Nvidia
>>>> GeForce 9500 card with 1 GB of memory on each node, and an
>>>> InfiniBand private network among the nodes. I used a direct send
>>>> compound. Each node gets a 1kx1kx256 subvolume. Each node renders
>>>> its 256 MB volume locally at 16-20 fps. When I used 4 nodes to
>>>> render the 1 GB volume, the overall performance was about 10-12
>>>> fps. But when I used 8 nodes to render the whole 2 GB volume, the
>>>> frame rate dropped to 2 fps. I have tried both DB and direct send
>>>> compounds, with Ethernet and IB networks; it is all a very
>>>> consistent 2 fps. Is there something I can do to improve the
>>>> performance? Thanks a lot!
>>>>
>>>> Jinghua http://n2.nabble.com/file/n2335264/test-8node.res.infi
>>>> test-8node.res.infi
>>>>
>>>> Attached is my config file.
>>>> --
>>>> View this message in context:
>>> http://n2.nabble.com/eVolve-render-2G-volume-on-8nodes-at-2fps-tp2335264p2335264.html
>>>> Sent from the Equalizer - Parallel Rendering mailing list archive
>>>> at
>>>> Nabble.com.
>>>>
>>>>
>>>> _______________________________________________
>>>> eq-dev mailing list
>>>> [email protected]
>>>> http://www.equalizergraphics.com/cgi-bin/mailman/listinfo/eq-dev
>>>> http://www.equalizergraphics.com
>>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Jinghua Ge, Ph.D
>>> Visualization Consultant, CCT
>>> 331 Frey Computing Services Center
>>> Louisiana State University
>>> Phone: (225) 578-7789
>>> Fax: (225) 334-2061
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
> --
> View this message in context:
> http://n2.nabble.com/eVolve-render-2G-volume-on-8nodes-at-2fps-tp2335264p2347050.html
> Sent from the Equalizer - Parallel Rendering mailing list archive at
> Nabble.com.
>
>
>
_______________________________________________
eq-dev mailing list
[email protected]
http://www.equalizergraphics.com/cgi-bin/mailman/listinfo/eq-dev
http://www.equalizergraphics.com