On Wed, Nov 23, 2016 at 4:16 PM, Thierry Goubier <[email protected]>
wrote:

>
>
> 2016-11-23 15:46 GMT+01:00 [email protected] <[email protected]>:
>
>> Thanks Thierry.
>>
>> Please also note that with new satellites, the resolution is ever
>> increasing (e.g. Sentinel,
>> http://m.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Overview4)
>>
>
> It has always been so. Anytime you reach a reasonable size, they send a
> new satellite with higher res / larger images :)
>
>
>>
>> I understand the tile thing, and indeed a lot of the algos work on tiles,
>> but there are other ways to do this, and especially with real-time geo
>> queries on custom-defined polygons, you only go so far with tiles. That is
>> one reason we are using GeoTrellis backed by Accumulo: to pump data very
>> fast in random order.
>>
>
> But that means you're dealing with preprocessed / graph-georeferenced data
> (i.e. OpenStreetMap-type data). If you're dealing with rasters, your
> polygons are approximated by a set of tiles (with a tile size well suited
> to your network / disk array).
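>
> A minimal sketch of that tile approximation, assuming axis-aligned tiles
> and a hypothetical tile_size matched to the storage block size (Python,
> illustrative only, not anyone's production code):
>
>     def tiles_covering(bbox, tile_size):
>         """Return (col, row) indices of all tiles touching the bounding box."""
>         xmin, ymin, xmax, ymax = bbox
>         c0, r0 = int(xmin // tile_size), int(ymin // tile_size)
>         c1, r1 = int(xmax // tile_size), int(ymax // tile_size)
>         return [(c, r) for r in range(r0, r1 + 1)
>                 for c in range(c0, c1 + 1)]
>
>     # The bounding box is a coarse first cut; a per-tile polygon
>     # intersection test would refine it. A polygon query then reads
>     # only these tiles, never the whole raster.
>     print(tiles_covering((10.0, 20.0, 35.0, 42.0), tile_size=16))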
>
> I had reasonable success a long time ago (1991, I think), for Ifremer,
> with an unbalanced, quadtree-like decomposition for highly irregular
> curves on the seabed. Tree node size / tile size was computed to be
> exactly equal to the disk block size on a very slow medium (a small sketch
> of the idea is below, after the footnotes). That sort of work is in the
> line of a geographic index for a database: optimise query accesses to
> geo-referenced objects... What is hard, and probably what you are doing,
> is combining geographic queries with graph queries (give me all houses in
> Belgium within a ten-minute bus + walk trip to a primary school)(*)
>
> (*) One can work that out on a raster for speed. This is what GRASS does
> for example.
>
> (**) I asked a student to accelerate some raster processing on a very
> small FPGA a long time ago. Once he had understood he could pipeline the
> design to increase the frequency, he then discovered that the FPGA would
> happily grok data faster than the computer bus could provide it :) leaving
> no bandwidth for the data to be written back to memory.
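>
> A tiny sketch of the quadtree decomposition mentioned above, with a
> hypothetical BLOCK_SIZE and point size; a leaf splits when its payload no
> longer fits in one disk block, so dense regions get deeper, unbalanced
> subtrees (Python, illustrative only):
>
>     BLOCK_SIZE = 4096                   # bytes per disk block (assumption)
>     POINT_SIZE = 16                     # bytes per stored point (assumption)
>     MAX_POINTS = BLOCK_SIZE // POINT_SIZE
>
>     class QuadNode:
>         def __init__(self, x, y, size):
>             self.x, self.y, self.size = x, y, size
>             self.points, self.children = [], None
>
>         def insert(self, p):
>             if self.children is not None:
>                 return self._child_for(p).insert(p)
>             self.points.append(p)
>             if len(self.points) > MAX_POINTS:   # payload exceeds one block
>                 self._split()
>
>         def _split(self):
>             h = self.size / 2
>             self.children = [QuadNode(self.x + dx * h, self.y + dy * h, h)
>                              for dy in (0, 1) for dx in (0, 1)]
>             pts, self.points = self.points, []
>             for p in pts:                       # redistribute into quadrants
>                 self._child_for(p).insert(p)
>
>         def _child_for(self, p):
>             h = self.size / 2
>             i = (2 if p[1] >= self.y + h else 0) + (1 if p[0] >= self.x + h else 0)
>             return self.children[i]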
>

Yes, but the network can be pretty fast with bonded Ethernet interfaces
these days.

>
>
>>
>> We are adding 30+ servers to the cluster at the moment just to deal with
>> the sizes, as there is a project mapping the energy landscape:
>> https://vito.be/en/land-use/land-use/energy-landscapes. This thing is
>> throwing YARN containers around and burns CPU very intensively. It is not
>> uncommon for me to see their workload eating everything for a serious
>> number of CPU-seconds.
>>
>
> Only a few seconds?
>

CPU-seconds; that is the cluster usage unit for CPU.
http://serverfault.com/questions/138703/a-definition-for-a-cpu-second
So, say a couple million of them on a 640-core setup. CPU power seems to be
the limiting factor in these workloads.
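
For a rough sense of scale: 2,000,000 CPU-seconds spread over 640 cores is
2,000,000 / 640 ≈ 3,125 seconds, i.e. roughly 52 minutes of wall-clock time
with every core fully busy; any idle capacity stretches that out further.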

>
>
>>
>> It would be silly not to plug Pharo into all of this infrastructure, I
>> think.
>>
>
> I've had quite bad results with Pharo on compute-intensive code recently,
> so I'd plan carefully how I use it. On that sort of hardware, in the
> projects I'm working on, 1000x faster than Pharo on a single node is about
> the expected target.
>

Sure, but the lower-level C/C++ things are driven from Python or Java, so
Pharo would not do worse. The good bit about Pharo is that one can ship a
preloaded image, which is easier than sending gigabyte-sized (!) uberjars
around that Java unzips before running; the same goes for Python's myriad
of dependencies. An image file looks super small by comparison.

>
>
>>
>> Especially given the density of PhDs/postdocs/brainiacs per square meter
>> there. If you have seen the TV show Lost, well, working at that place
>> kind of feels like it, especially since it is kind of hidden in the woods.
>>
>> Maybe you could have interesting interactions with them. These guys also
>> have their own nuclear reactor and geothermal drilling.
>>
>
> I'd be interested, because we're working a bit on high-performance
> parallel runtimes and compilation for those. If one day you happen to be
> ready to talk about it at our place? South of Paris, not too hard to reach
> by public transport :)
>
Sure, that would be awesome. But Q1 2017 then, because my schedule is
pretty packed at the moment. I can show you the thing over the web from my
side, so you can see where we are in terms of systems. I guess you are much
more advanced, but one of the goals of the project here is to be pretty
approachable and to gather a community that will cross-pollinate algos and
datasets for network effects.

Phil


> Thierry
>
>
>
>> Phil
>>
>>
>>
>> On Wed, Nov 23, 2016 at 1:30 PM, Thierry Goubier <
>> [email protected]> wrote:
>>
>>> Hi Phil,
>>>
>>> 2016-11-23 12:17 GMT+01:00 [email protected] <
>>> [email protected]>:
>>>
>>>> [ ...]
>>>>
>>>> It is really important to have such features to avoid massive GC pauses.
>>>>
>>>> My use case is to load the data sets from here.
>>>> http://proba-v.vgt.vito.be/sites/default/files/Product_User_Manual.pdf
>>>>
>>> I've used that type of data before, a long time ago.
>>>
>>> I consider that tiled / on-demand block loading is the way to go for
>>> those. Work with the header as long as possible, stream tiles if you need
>>> to work on the full data set. There is a good chance that:
>>>
>>> 1- You're memory bound for anything you compute with them
>>> 2- I/O times dominate, or become low enough not to care (very fast
>>> SSDs)
>>> 3- It's very rare that you need full random access on the complete array
>>> 4- GC doesn't matter
>>>
>>> Stream computing is your solution! This is how raster GIS systems are
>>> implemented.
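>>>
>>> A minimal sketch of that streaming style, assuming a hypothetical raw
>>> raster of 16-bit samples stored tile after tile; only one tile is ever
>>> in memory, so the GC has nothing to chew on (Python, illustrative only):
>>>
>>>     import numpy as np
>>>
>>>     TILE = 256                      # tile edge in pixels (assumption)
>>>
>>>     def stream_tiles(path, n_tiles, dtype=np.uint16):
>>>         """Yield one TILE x TILE block at a time, never the full raster."""
>>>         tile_bytes = TILE * TILE * np.dtype(dtype).itemsize
>>>         with open(path, "rb") as f:
>>>             for _ in range(n_tiles):
>>>                 buf = f.read(tile_bytes)
>>>                 yield np.frombuffer(buf, dtype=dtype).reshape(TILE, TILE)
>>>
>>>     # Running statistic over the whole product, one tile at a time.
>>>     total, count = 0.0, 0
>>>     for tile in stream_tiles("segment.raw", n_tiles=1024):  # hypothetical file
>>>         total += tile.sum(dtype=np.float64)
>>>         count += tile.size
>>>     print("mean sample value:", total / count)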
>>>
>>> What is hard for me is manipulating a very large graph, or a very large
>>> sparse structure, like a huge Famix model or an FPGA layout model with a
>>> full design laid out on top. There, you're randomly accessing the whole
>>> structure (or at least you see no obvious partition) and the structure
>>> is too large for the memory or the GC.
>>>
>>> This is why, a long time ago, I had this idea of an in-memory working
>>> set / on-disk full structure, with automatic determination of what the
>>> working set is.
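>>>
>>> A minimal sketch of that working-set idea, assuming nodes are pickled
>>> one per file under a hypothetical directory; an LRU cache keeps the hot
>>> part of the structure in memory and evicts the rest (Python,
>>> illustrative only):
>>>
>>>     import pickle
>>>     from collections import OrderedDict
>>>
>>>     class NodeStore:
>>>         """On-disk full structure, in-memory LRU working set."""
>>>         def __init__(self, directory, capacity=10000):
>>>             self.directory, self.capacity = directory, capacity
>>>             self.cache = OrderedDict()          # node_id -> node
>>>
>>>         def get(self, node_id):
>>>             if node_id in self.cache:           # hit: refresh recency
>>>                 self.cache.move_to_end(node_id)
>>>                 return self.cache[node_id]
>>>             with open(f"{self.directory}/{node_id}.pkl", "rb") as f:
>>>                 node = pickle.load(f)           # miss: fault in from disk
>>>             self.cache[node_id] = node
>>>             if len(self.cache) > self.capacity: # evict least recently used
>>>                 self.cache.popitem(last=False)
>>>             return node
>>>
>>> The access pattern itself then defines the working set: whatever
>>> survives in the cache is, by construction, the hot part of the graph.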
>>>
>>> For pointers, have a look at the Graph500 and HPCG benchmarks,
>>> especially the efficiency (ratio to peak) of HPCG runs, to see how
>>> difficult these cases are.
>>>
>>> Regards,
>>>
>>> Thierry
>>>
>>
>>
>
