Re: [9fans] threads vs forks

2009-03-07 Thread Bakul Shah
On Fri, 06 Mar 2009 12:38:57 PST David Leimbach leim...@gmail.com  wrote:
 
 Things like Clojure, or Scala become a bit more interesting when the VM is
 extended to allow tail recursion to happen in a nice way.

A lack of TCO is not something that will prevent you from
writing many interesting programs (except things like a state
machine as a set of mutually calling functions!).

There is nothing in Clojure, or C for that matter, that
disallows tail call optimization should an implementation
provide it.  It is just that, unlike Scheme, most programming
languages do not *mandate* that tail calls be optimized.
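A common workaround in languages without TCO is a trampoline: each tail position returns a thunk instead of making the call, and a driver loop bounces the thunks until a value comes back. A minimal Python sketch of the mutually-calling-state-machine case mentioned above (function names are purely illustrative):

```python
# State machine as mutually "calling" functions.  Without tail-call
# optimization, even/odd calling each other directly would overflow the
# stack for large n; returning a thunk from each tail position keeps the
# stack depth constant.

def even(n):
    if n == 0:
        return True
    return lambda: odd(n - 1)        # thunk in tail position

def odd(n):
    if n == 0:
        return False
    return lambda: even(n - 1)

def trampoline(f, *args):
    result = f(*args)
    while callable(result):          # bounce until a non-callable value
        result = result()
    return result

print(trampoline(even, 100001))      # False -- far past the recursion limit
```

Direct mutual recursion here would hit Python's default recursion limit (about 1000 frames); the trampoline runs in constant stack space at the cost of an allocation per step.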



Re: [9fans] threads vs forks

2009-03-07 Thread erik quanstrom
On Sat Mar  7 01:02:31 EST 2009, j...@eecs.harvard.edu wrote:
 On Fri, Mar 06, 2009 at 10:31:59PM -0500, erik quanstrom wrote:
  it's interesting to note that the quoted mtbf numbers for ssds are
  within a factor of 2 of enterprise hard drives.  if one considers that
  one needs ~4 ssds to cover the capacity of 1 hard drive, the quoted
  mtbf/byte is worse for ssd.
 
 That's only if you think of flash as a direct replacement for disk.

i think that's why they put them in a 2.5" form factor with a standard
SATA interface.  what are you thinking of?

 SSDs are expensive on a $/MB basis compared to disks.  The good ones

not as much as you think.  a top-drawer 15k sas drive is on the order
of 300GB and $350+.  the intel ssd is only twice as much.  if you compare
the drives supported by the big-iron vendors, intel ssd already has cost
parity.

 For short-lived data you need only go over the I/O bus twice vs. three
 times for most NVRAMs based on battery-backed DRAM.

i'm missing something here.  what are your assumptions
on how things are connected?  also, isn't there an assumption
that you don't want to be writing short-lived data to flash if
possible?

- erik



Re: [9fans] threads vs forks

2009-03-07 Thread erik quanstrom
On Sat Mar  7 09:39:38 EST 2009, j...@eecs.harvard.edu wrote:
 On Sat, Mar 07, 2009 at 08:58:42AM -0500, erik quanstrom wrote:
  i think that's why they put them in a 2.5" form factor with a standard
  SATA interface.  what are you thinking of?
 
 No, the reason they do that is for backwards compatibility.

it's kind of funny to call sata backwards compatibility.  if
things go as you suggest (pcie connected), i think we'll all
long for the day when we could write one driver per hba rather
than one driver per storage device.

new boss, same as the old boss.

   SSDs are expensive on a $/MB basis compared to disks.  The good ones
  
  not as much as you think.  a top-drawer 15k sas drive is on the order
  of 300GB and $350+.  the intel ssd is only twice as much.  if you compare
  the drives supported by the big-iron vendors, intel ssd already has cost
  parity.
 
 The Intel SSD is cheap and slow :-)

pick a lane!  first you argued that they are expensive. ☺

 Take a gander at the NetApp NAS filers or DataDomain restorers.

so you're saying that these machines don't differentiate between
primary cache and their write log (or whatever they call it)?

 My point isn't that it is a bad idea, just that it isn't
 likely to provide enough business to keep manufacturers
 interested.  Moreover, for capacity disks will keep on
 winning for a long time.  They just start to look more
 and more like tape.

no.  i agree.  worm storage in general is not a popular topic,
but the few companies that do use it pay the big bucks for it.

it's always great when the backup media is less reliable
than the primary media.

- erik



Re: [9fans] threads vs forks

2009-03-06 Thread Roman V Shaposhnik
Clojure is definitely something that I would like to play
with extensively. Looks very promising from the outset,
so the only question that I have is how does it feel
when used for substantial things.

Thanks,
Roman.

P.S. My belief in it was actually reaffirmed by a raving
endorsement it got from an old LISP community. Those
guys are a bit like 9fans, if you know what I mean ;-)


On Tue, 2009-03-03 at 10:38 -0800, Bakul Shah wrote:
 On Tue, 03 Mar 2009 10:11:10 PST Roman V. Shaposhnik r...@sun.com  wrote:
  On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
  
   My knowledge on this subject is about 8 or 9 years old, so check with
   your local Python guru
   
   
   The last I'd heard about Python's threading is that it was cooperative
   only, and that you couldn't get real parallelism out of it.  It serves
   as a means to organize your program in a concurrent manner.  
   
   
   In other words no two threads run at the same time in Python, even if
   you're on a multi-core system, due to something they call a Global
   Interpreter Lock.  
  
  I believe the GIL is as present in Python nowadays as ever. On a related
  note: does anybody know any sane interpreted languages with a decent
  threading model to go along? Stackless Python is the only thing that
  I'm familiar with in that department.
 
 Depends on what you mean by a sane interpreted language with a
 decent threading model and what you want to do with it, but
 check out www.clojure.org.  Then there is Erlang.  Its
 wikipedia entry has this to say:
 Although Erlang was designed to fill a niche and has
 remained an obscure language for most of its existence,
 it is experiencing a rapid increase in popularity due to
 increased demand for concurrent services, inferior models
 of concurrency in most mainstream programming languages,
 and its substantial libraries and documentation.[7][8]
 Well-known applications include Amazon SimpleDB,[9]
 Yahoo! Delicious,[10] and the Facebook Chat system.[11]
 




Re: [9fans] threads vs forks

2009-03-06 Thread David Leimbach
Things like Clojure, or Scala become a bit more interesting when the VM is
extended to allow tail recursion to happen in a nice way.

On Fri, Mar 6, 2009 at 10:47 AM, Roman V Shaposhnik r...@sun.com wrote:

 Clojure is definitely something that I would like to play
 with extensively. Looks very promising from the outset,
 so the only question that I have is how does it feel
 when used for substantial things.

 Thanks,
 Roman.

 P.S. My belief in it was actually reaffirmed by a raving
 endorsement it got from an old LISP community. Those
 guys are a bit like 9fans, if you know what I mean ;-)


 On Tue, 2009-03-03 at 10:38 -0800, Bakul Shah wrote:
  On Tue, 03 Mar 2009 10:11:10 PST Roman V. Shaposhnik r...@sun.com
  wrote:
   On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
  
My knowledge on this subject is about 8 or 9 years old, so check with
your local Python guru
   
   
The last I'd heard about Python's threading is that it was cooperative
only, and that you couldn't get real parallelism out of it.  It serves
as a means to organize your program in a concurrent manner.
   
   
In other words no two threads run at the same time in Python, even if
you're on a multi-core system, due to something they call a Global
Interpreter Lock.
  
   I believe the GIL is as present in Python nowadays as ever. On a related
   note: does anybody know any sane interpreted languages with a decent
   threading model to go along? Stackless Python is the only thing that
   I'm familiar with in that department.
 
  Depends on what you mean by a sane interpreted language with a
  decent threading model and what you want to do with it, but
  check out www.clojure.org.  Then there is Erlang.  Its
  wikipedia entry has this to say:
  Although Erlang was designed to fill a niche and has
  remained an obscure language for most of its existence,
  it is experiencing a rapid increase in popularity due to
  increased demand for concurrent services, inferior models
  of concurrency in most mainstream programming languages,
  and its substantial libraries and documentation.[7][8]
  Well-known applications include Amazon SimpleDB,[9]
  Yahoo! Delicious,[10] and the Facebook Chat system.[11]
 





Re: [9fans] threads vs forks

2009-03-06 Thread Bakul Shah
On Fri, 06 Mar 2009 10:47:20 PST Roman V Shaposhnik r...@sun.com  wrote:
 Clojure is definitely something that I would like to play
 with extensively. Looks very promising from the outset,
 so the only question that I have is how does it feel
 when used for substantial things.

You can browse various Clojure related google groups but
there is only one way to find out if it is for you!

 P.S. My belief in it was actually reaffirmed by a raving
 endorsement it got from an old LISP community. Those
 guys are a bit like 9fans, if you know what I mean ;-)

No comment :-)



Re: [9fans] threads vs forks

2009-03-06 Thread Brian L. Stuart
 P.S. My belief in it was actually reaffirmed by a raving
 endorsement it got from an old LISP community. Those
 guys are a bit like 9fans, if you know what I mean ;-)

You mean intelligent people who appreciate elegance? :)

Sorry.  Couldn't resist.

BLS




Re: [9fans] threads vs forks

2009-03-06 Thread erik quanstrom
 To be less flippant, what makes high performance flash difficult
 is the slow erasure time and large erasure blocks relative to
 the size of individual flash pages.  Being full hurts since the
 flash is typically managed by a log structured storage system
 with a garbage collector.  Small random writes require updating
 the logical-physical mapping efficiently and crash recoverably.
 You also need to do copy-on-write which leads to what is commonly
 called write amplification, which reduces the usable number of
 writes.  Small writes tend to exacerbate a lot of these problems.
 
 Where does all this fancy stuff belong?  In the storage medium,
 in the HBA, in the device driver, in the file system, or in the
 application?
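The log-structured management the quoted paragraph describes can be illustrated with a toy flash translation layer. This is a hypothetical sketch (class name, block size, and GC policy all invented for illustration), not how any real SSD firmware works; it just shows why small random writes force the GC to relocate live pages, i.e. write amplification:

```python
# Toy flash translation layer: all writes go out-of-place to a log head,
# a map tracks logical page -> physical page, and whole erase blocks are
# reclaimed by a garbage collector that must copy surviving live pages.
PAGES_PER_BLOCK = 4

class ToyFTL:
    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.flash = [[None] * PAGES_PER_BLOCK for _ in range(nblocks)]
        self.l2p = {}                        # logical page -> (block, slot)
        self.free_blocks = list(range(1, nblocks))
        self.head, self.slot = 0, 0          # log head: next free physical page
        self.copies = 0                      # live pages relocated by the GC

    def write(self, lpage, data):
        # copy-on-write: the old physical copy (if any) just becomes garbage
        while self.slot == PAGES_PER_BLOCK:  # current block is full
            if self.free_blocks:
                self.head = self.free_blocks.pop(0)
                self.slot = 0
            else:
                self._gc()                   # must erase something first
        self.flash[self.head][self.slot] = (lpage, data)
        self.l2p[lpage] = (self.head, self.slot)
        self.slot += 1

    def read(self, lpage):
        b, s = self.l2p[lpage]
        return self.flash[b][s][1]

    def _live(self, b):
        # pages in block b still referenced by the logical->physical map
        return [e for s, e in enumerate(self.flash[b])
                if e is not None and self.l2p.get(e[0]) == (b, s)]

    def _gc(self):
        # reclaim the cheapest block (fewest live pages), then re-log the
        # survivors -- those extra writes are the write amplification
        victim = min((b for b in range(self.nblocks) if b != self.head),
                     key=lambda b: len(self._live(b)))
        survivors = self._live(victim)
        self.flash[victim] = [None] * PAGES_PER_BLOCK  # slow whole-block erase
        self.free_blocks.append(victim)
        for lpage, data in survivors:
            self.copies += 1
            self.write(lpage, data)

ftl = ToyFTL(nblocks=4)              # 16 physical pages, 4 per erase block
for i in range(12):                  # fill 12 logical pages (75% utilized)
    ftl.write(i, "cold")
for i in (0, 4, 8, 0, 4):            # scattered small overwrites
    ftl.write(i, "hot")
print(ftl.copies)                    # 3: GC had to relocate live cold pages
```

Note the sketch assumes some spare capacity; a full device of entirely live data would leave the GC nothing cheap to reclaim, which is one reason real drives overprovision.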

it's interesting to note that the quoted mtbf numbers for ssds are
within a factor of 2 of enterprise hard drives.  if one considers that
one needs ~4 ssds to cover the capacity of 1 hard drive, the quoted
mtbf/byte is worse for ssd.

the obvious conclusion is that if you think you need raid for hard
drives, then you also need raid for ssds.  at least if you believe the
mtbf numbers.

i think that it's a real good question where the fancy flash
tricks belong.  the naive guess would be that for backwards compatibility
reasons, the media will get much of the smarts.

- erik



Re: [9fans] threads vs forks

2009-03-06 Thread lucio
 Where does all this fancy stuff belong?  In the storage medium,
 in the HBA, in the device driver, in the file system, or in the
 application?

In a very intelligent cache?  Or did you mention that above and in my
ignorance I missed it?

OK, let's try this:

. Storage medium: only the hardware developers have access to that and
  they have never seemed interested in matching anyone else's
  requirements or suggestions.

. The HBA (?).  If that's the device adapter, the same applies as
  above.

. The device driver should not be very complex, and the block handling
  should hopefully be shared by more than one device driver, which,
  with the effective demise of Streams, is not a very easy thing to
  implement without jumping through flaming hoops.

. The application?  That's being facetious, surely?

. A cache?  As quanstro pointed out, flash makes a wonderful WORM.
  Now get Fossil to work as originally intended, or a more suitable
  design and implementation to take its place in this role, and we have
  a winner.

++L




Re: [9fans] threads vs forks

2009-03-06 Thread lucio
 Much of the intelligence
 actually resides in the device driver.  It is that secret sauce
 that gets you good performance.  In theory it could be pushed
 down, but it takes CPU, memory, and memory bandwidth that may
 not be cost effective there.

That would entail a really intelligent controller, which brings us
back to a cache, does it not, this time hidden inside a black box.  I
have been thinking that the obsession with SMP has a negative impact
on diverse engineering where intelligent peripherals take over
operations that are too slow or too demanding on the generic CPU.
Smacks of AoE to me, with a lot more packed into the A.

But I'm just an old software developer with a hobbyist interest in
electronic engineering and my opinions are not backed by much
research.

++L




Re: [9fans] threads vs forks

2009-03-06 Thread erik quanstrom
 Sadly, if a WORM is your only application, then no one cares.
 At least not enough to pony up for real performance.  The folks
 at places like Sandia are interested in running HPC applications
 and there are a lot of people in other industries such as big oil
 and finance that are willing to pay for performance for running
 HPC applications, VMs which tend to have high I/O requirements when
 an OS patch comes out, etc.

ask not what a technology can do for the world,
ask what a technology can do for you!

- erik



Re: [9fans] threads vs forks

2009-03-05 Thread maht



That's a fact. If you have access to The ACM Queue, check out
p16-cantrill-concurrency.pdf (Cantrill and Bonwick on concurrency).
  
Or you can rely on one of the hackish attempts at email attachment
management or whatever conceptual error led to this:


https://agora.cs.illinois.edu/download/attachments/18744240/p16-cantrill.pdf?version=1


courtesy of a google datacentre near you





Re: [9fans] threads vs forks

2009-03-04 Thread Vincent Schut

John Barham wrote:

On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:


I have to launch many tasks running in parallel (~5000) in a
cluster running linux. Each of the tasks performs some astronomical
calculations and I am not quite sure if using fork is the best answer
here.
First of all, all the programming is done in python and c...


Take a look at the multiprocessing package
(http://docs.python.org/library/multiprocessing.html), newly
introduced with Python 2.6 and 3.0:

multiprocessing is a package that supports spawning processes using
an API similar to the threading module. The multiprocessing package
offers both local and remote concurrency, effectively side-stepping
the Global Interpreter Lock by using subprocesses instead of threads.

It should be a quick and easy way to set up a cluster-wide job
processing system (provided all your jobs are driven by Python).


Better: use parallelpython (www.parallelpython.org). Afaik
multiprocessing is geared towards multi-core systems (one machine),
while pp is also suitable for real clusters with more PCs. No special
cluster software needed. It will start (here's your fork) one or more
python interpreters on each node, and then you can submit jobs to those
'workers'. The interpreters are kept alive between jobs, so the startup
penalty becomes negligible when the number of jobs is large enough.
Using it here to process massive amounts of satellite data; works like a
charm.
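The multiprocessing approach quoted above can be sketched in a few lines. Here `compute` is a hypothetical stand-in for the per-task astronomical calculation, and the context-manager form shown requires a newer Python than the 2.6 discussed in the thread:

```python
# Farm independent jobs out to a pool of worker processes; because each
# worker is a separate OS process, the Global Interpreter Lock is not a
# bottleneck.  compute() stands in for the real calculation.
from multiprocessing import Pool

def compute(task_id):
    return task_id * task_id         # placeholder work

if __name__ == "__main__":
    with Pool(processes=4) as pool:  # e.g. one worker per core
        results = pool.map(compute, range(16))
    print(results[:4])               # [0, 1, 4, 9]
```

`Pool.map` blocks until all tasks finish and returns results in input order, which is usually what a batch of ~5000 independent jobs wants; `imap_unordered` is the alternative when order does not matter.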


Vincent.


It also looks like it's been (partially?) back-ported to Python 2.4
and 2.5: http://pypi.python.org/pypi/processing.

  John







Re: [9fans] threads vs forks

2009-03-04 Thread hugo rivera
Thanks for the advice.
Nevertheless I am in no position to decide what pieces of software the
cluster will run, I just have to deal with what I have, but anyway I
can suggest other possibilities.

2009/3/4, Vincent Schut sc...@sarvision.nl:
 John Barham wrote:

  On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:
 
 
   I have to launch many tasks running in parallel (~5000) in a
    cluster running linux. Each of the tasks performs some astronomical
    calculations and I am not quite sure if using fork is the best answer
   here.
   First of all, all the programming is done in python and c...
  
 
  Take a look at the multiprocessing package
  (http://docs.python.org/library/multiprocessing.html),
 newly
  introduced with Python 2.6 and 3.0:
 
  multiprocessing is a package that supports spawning processes using
  an API similar to the threading module. The multiprocessing package
  offers both local and remote concurrency, effectively side-stepping
  the Global Interpreter Lock by using subprocesses instead of threads.
 
  It should be a quick and easy way to set up a cluster-wide job
  processing system (provided all your jobs are driven by Python).
 

  Better: use parallelpython (www.parallelpython.org). Afaik multiprocessing
 is geared towards multi-core systems (one machine), while pp is also
 suitable for real clusters with more PCs. No special cluster software
 needed. It will start (here's your fork) one or more python interpreters on
 each node, and then you can submit jobs to those 'workers'. The interpreters
 are kept alive between jobs, so the startup penalty becomes negligible when
 the number of jobs is large enough.
  Using it here to process massive amounts of satellite data; works like a
 charm.

  Vincent.


 
  It also looks like it's been (partially?) back-ported to Python 2.4
  and 2.5: http://pypi.python.org/pypi/processing.
 
   John
 
 
 





-- 
Hugo



Re: [9fans] threads vs forks

2009-03-04 Thread Vincent Schut

hugo rivera wrote:

Thanks for the advice.
Nevertheless I am in no position to decide what pieces of software the
cluster will run, I just have to deal with what I have, but anyway I
can suggest other possibilities.


Well, depends on how you define 'software the cluster will run'. Do you 
mean cluster management software, or really any program or script or 
python module that needs to be installed on each node? Because for pp, 
you won't need any cluster software. pp is just some python module and 
helper scripts. You *do* need to install this (pure python) module on 
each node, yes, but that's it, nothing else needed.
Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an 
expert, but I don't think you can do threading/forking from one machine 
to another (on linux). So I suppose there already is some cluster 
management software involved? And while you appear to be in no position 
to decide what pieces of software the cluster will run, you might want 
to enlighten us on what this cluster /will/ run? Your best solution 
might depend on that...


Cheers,
Vincent.




Re: [9fans] threads vs forks

2009-03-04 Thread Vincent Schut

hugo rivera wrote:

The cluster has torque installed as the resource manager. I think it
runs on top of pbs (an older project).
As far as I know now I just have to call a qsub command to submit my
jobs on a queue, then the resource manager allocates a processor in
the cluster for my process to run till it is finished.


Well, I don't know torque nor pbs, but I'm guessing that when you
submit a job, this job will be some program or script that is run on the
allocated processor? If so, your initial question of forking vs
threading is bogus. Your cluster manager will run (exec) your job, which
if it is a python script will start a python interpreter for each job. I
guess that's the overhead you get when running a flexible cluster
system, flexible meaning that it can run any type of job (shell script,
binary executable, python script, perl, etc.).
However, your overhead of starting new python processes each time may
seem significant when viewed in absolute terms, but if each job
processes lots of data and takes, as you said, 5 min to run on a decent
processor, don't you think the startup time for the python process would
become insignificant? For example, on a decent machine here, the first
time python takes 0.224 secs to start and shut down immediately, and
consecutive starts take only about 0.009 secs because everything is
still in memory. Let's take the 0.224 secs for a worst-case scenario.
That would be approx 0.075 percent of your job execution time. Now let's
say you have 6 machines with 8 cores each and perfect scaling; all your
jobs would take 6000 / (6*8) * 5 min = 625 minutes (10 hours 25 mins)
without python starting each time, and 625 minutes and 28 seconds with
python starting anew each job. Don't you think you could just live with
those 28 extra seconds? Just reading this message might already have
taken you more than 28 seconds...
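Vincent's arithmetic checks out; a quick back-of-the-envelope in Python, using only the numbers from his message:

```python
# Interpreter startup cost vs. total job time, using the figures above:
# 6000 jobs, 6 machines x 8 cores, 5 minutes per job, 0.224 s startup.
jobs = 6000
workers = 6 * 8
job_secs = 5 * 60
startup_secs = 0.224

total_min = jobs * job_secs / workers / 60      # perfect scaling assumed
extra_secs = jobs * startup_secs / workers      # added wall time from startups
pct_per_job = 100 * startup_secs / job_secs

print(total_min)                 # 625.0 minutes (10 h 25 min)
print(round(extra_secs))         # 28 extra seconds overall
print(round(pct_per_job, 3))     # 0.075 percent of each job
```

The 0.224 s worst case only matters at all because each qsub job execs a fresh interpreter; with warm starts (~0.009 s) the overhead drops by another factor of ~25.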


Vincent.


And I am not really sure if I have access to all the nodes, so I can
install pp on each one of them.

2009/3/4, Vincent Schut sc...@sarvision.nl:

hugo rivera wrote:


Thanks for the advice.
Nevertheless I am in no position to decide what pieces of software the
cluster will run, I just have to deal with what I have, but anyway I
can suggest other possibilities.


 Well, depends on how you define 'software the cluster will run'. Do you
mean cluster management software, or really any program or script or python
module that needs to be installed on each node? Because for pp, you won't
need any cluster software. pp is just some python module and helper scripts.
You *do* need to install this (pure python) module on each node, yes, but
that's it, nothing else needed.
 Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
expert, but I don't think you can do threading/forking from one machine to
another (on linux). So I suppose there already is some cluster management
software involved? And while you appear to be in no position to decide what
pieces of software the cluster will run, you might want to enlighten us on
what this cluster /will/ run? Your best solution might depend on that...

 Cheers,
 Vincent.











Re: [9fans] threads vs forks

2009-03-04 Thread hugo rivera
you are right. I was totally confused at the beginning.
Thanks a lot.

2009/3/4, Vincent Schut sc...@sarvision.nl:
 hugo rivera wrote:

  The cluster has torque installed as the resource manager. I think it
  runs of top of pbs (an older project).
  As far as I know now I just have to call a qsub command to submit my
  jobs on a queue, then the resource manager allocates a processor in
  the cluster for my process to run till is finished.
 

  Well, I don't know torque nor pbs, but I'm guessing that when you
 submit a job, this job will be some program or script that is run on the
 allocated processor? If so, your initial question of forking vs threading is
 bogus. Your cluster manager will run (exec) your job, which if it is a
 python script will start a python interpreter for each job. I guess that's
 the overhead you get when running a flexible cluster system, flexible
 meaning that it can run any type of job (shell script, binary executable,
 python script, perl, etc.).
  However, your overhead of starting new python processes each time may seem
 significant when viewed in absolute terms, but if each job processes lots of
 data and takes, as you said, 5 min to run on a decent processor, don't you
 think the startup time for the python process would become non-significant?
 For example, on a decent machine here, the first time python takes 0.224
 secs to start and shut down immediately, and consecutive starts take only
 about 0.009 secs because everything is still in memory. Let's take the 0.224
 secs for a worst case scenario. That would be approx 0.075 percent of your
 job execution time. Now lets say you have 6 machines with 8 cores each and
 perfect scaling, all your jobs would take 6000 / (6*8) *5min = 625 minutes
 (10 hours 25 mins) without python starting each time, and 625 minutes and 28
 seconds with python starting anew each job. Don't you think you could just
 live with these 28 seconds more? Just reading this message might already
 have taken you more than those 28 seconds...

  Vincent.



  And I am not really sure if I have access to all the nodes, so I can
  install pp on each one of them.
 
  2009/3/4, Vincent Schut sc...@sarvision.nl:
 
   hugo rivera wrote:
  
  
Thanks for the advice.
Nevertheless I am in no position to decide what pieces of software the
cluster will run, I just have to deal with what I have, but anyway I
can suggest other possibilities.
   
   
Well, depends on how you define 'software the cluster will run'. Do you
   mean cluster management software, or really any program or script or
 python
   module that needs to be installed on each node? Because for pp, you
 won't
   need any cluster software. pp is just some python module and helper
 scripts.
   You *do* need to install this (pure python) module on each node, yes,
 but
   that's it, nothing else needed.
Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
   expert, but I don't think you can do threading/forking from one machine
 to
   another (on linux). So I suppose there already is some cluster
 management
   software involved? And while you appear to be in no position to decide
 what
   pieces of software the cluster will run, you might want to enlighten us
 on
   what this cluster /will/ run? Your best solution might depend on that...
  
Cheers,
Vincent.
  
  
  
  
 
 
 





-- 
Hugo



Re: [9fans] threads vs forks

2009-03-04 Thread Uriel
What about xcpu?


On Wed, Mar 4, 2009 at 12:33 PM, hugo rivera uai...@gmail.com wrote:
 you are right. I was totally confused at the beginning.
 Thanks a lot.

 2009/3/4, Vincent Schut sc...@sarvision.nl:
 hugo rivera wrote:

  The cluster has torque installed as the resource manager. I think it
  runs of top of pbs (an older project).
  As far as I know now I just have to call a qsub command to submit my
  jobs on a queue, then the resource manager allocates a processor in
  the cluster for my process to run till is finished.
 

  Well, I don't know torque nor pbs, but I'm guessing that when you
 submit a job, this job will be some program or script that is run on the
 allocated processor? If so, your initial question of forking vs threading is
 bogus. Your cluster manager will run (exec) your job, which if it is a
 python script will start a python interpreter for each job. I guess that's
 the overhead you get when running a flexible cluster system, flexible
 meaning that it can run any type of job (shell script, binary executable,
 python script, perl, etc.).
  However, your overhead of starting new python processes each time may seem
 significant when viewed in absolute terms, but if each job processes lots of
 data and takes, as you said, 5 min to run on a decent processor, don't you
 think the startup time for the python process would become non-significant?
 For example, on a decent machine here, the first time python takes 0.224
 secs to start and shut down immediately, and consecutive starts take only
 about 0.009 secs because everything is still in memory. Let's take the 0.224
 secs for a worst case scenario. That would be approx 0.075 percent of your
 job execution time. Now lets say you have 6 machines with 8 cores each and
 perfect scaling, all your jobs would take 6000 / (6*8) *5min = 625 minutes
 (10 hours 25 mins) without python starting each time, and 625 minutes and 28
 seconds with python starting anew each job. Don't you think you could just
 live with these 28 seconds more? Just reading this message might already
 have taken you more than those 28 seconds...

  Vincent.



  And I am not really sure if I have access to all the nodes, so I can
  install pp on each one of them.
 
  2009/3/4, Vincent Schut sc...@sarvision.nl:
 
   hugo rivera wrote:
  
  
Thanks for the advice.
Nevertheless I am in no position to decide what pieces of software the
cluster will run, I just have to deal with what I have, but anyway I
can suggest other possibilities.
   
   
    Well, depends on how you define 'software the cluster will run'. Do you
   mean cluster management software, or really any program or script or
 python
   module that needs to be installed on each node? Because for pp, you
 won't
   need any cluster software. pp is just some python module and helper
 scripts.
   You *do* need to install this (pure python) module on each node, yes,
 but
   that's it, nothing else needed.
    Btw, you said 'it's a small cluster, about 6 machines'. Now I'm not an
   expert, but I don't think you can do threading/forking from one machine
 to
   another (on linux). So I suppose there already is some cluster
 management
   software involved? And while you appear to be in no position to decide
 what
   pieces of software the cluster will run, you might want to enlighten us
 on
   what this cluster /will/ run? Your best solution might depend on that...
  
    Cheers,
    Vincent.
  
  
  
  
 
 
 





 --
 Hugo





Re: [9fans] threads vs forks

2009-03-04 Thread ron minnich
On Wed, Mar 4, 2009 at 2:30 AM, Vincent Schut sc...@sarvision.nl wrote:
 hugo rivera wrote:
Now I'm not an
 expert, but I don't think you can do threading/forking from one machine to
 another (on linux).

You can with bproc, but it's not supported past 2.6.21 or so.

ron



Re: [9fans] threads vs forks

2009-03-04 Thread Roman V Shaposhnik
On Tue, 2009-03-03 at 23:24 -0600, blstu...@bellsouth.net wrote:
  it's interesting that parallel wasn't cool when chips were getting
  noticably faster rapidly.  perhaps the focus on parallelization
  is a sign there aren't any other ideas.
 
 Gotta do something with all the extra transistors.  After all, Moore's
 law hasn't been repealed.  And pipelines and traditional caches
 are pretty good examples of diminishing returns.  So multiple cores
 seems a pretty straightforward approach.

Our running joke circa '05 was that the industry was suffering from
a transistor overproduction crisis. One only needs to look at other
overproduction crises (especially in the food industry) to appreciate
the similarities.

 Now there is another use that would at least be intellectually interesting
 and possibly useful in practice.  Use the transistors for a really big
 memory running at cache speed.  But instead of it being a hardware
 cache, manage it explicitly.  In effect, we have a very high speed
 main memory, and the traditional main memory is backing store.
 It'd give a use for all those paging algorithms that aren't particularly
 justified at the main memory-disk boundary any more.  And you
 can fit a lot of Plan 9 executable images in a 64MB on-chip memory
 space.  Obviously, it wouldn't be a good fit for severely memory-hungry
 apps, and it might be a dead end overall, but it'd at least be something
 different...

One could argue that the transactional memory model is supposed to be
exactly that.

Thanks,
Roman.




Re: [9fans] threads vs forks

2009-03-04 Thread J.R. Mauro
On Wed, Mar 4, 2009 at 12:50 AM, erik quanstrom quans...@quanstro.net wrote:
 
  Both AMD and Intel are looking at I/O because it is and will be a limiting
  factor when scaling to higher core counts.

 i/o starts sucking wind with one core.
 that's why we differentiate i/o from everything
 else we do.

 And soon hard disk latencies are really going to start hurting (they
 already are hurting some, I'm sure), and I'm not convinced of the
 viability of SSDs.

 i'll assume you mean throughput.  hard drive latency has been a big deal
 for a long time.  tanenbaum integrated knowledge of track layout into
 his minix elevator algorithm.

Yes, sorry.


 i think the gap between cpu performance and hd performance is narrowing,
 not getting wider.

 i don't have accurate measurements on how much real-world performance
 difference there is between a core i7 and an intel 5000.  it's generally not
 spectacular, clock-for-clock. on the other hand, when the intel 5000-series
 was released, the rule of thumb for a sata hd was 50mb/s.  it's not too hard
 to find regular sata hard drives that do 110mb/s today.  the ssd drives we've
(coraid) tested have been spectacular, reading at 200mb/s.  if you want
 to talk latency, ssds can deliver 1/100th the latency of spinning media.
 there's no way that the core i7 is 100x faster than the intel 5000.

For the costs (in terms of power and durability) hard drives are
really a pain, not just for some of the companies I've talked to that
are burning out terabyte drives in a matter of weeks, but for mere
mortals as well. And I'm sorry but the performance of hard drives is
*not* very good, despite it improving. Every time I do something on a
large directory tree, my drive (which is a model from last year)
grinds and moans and takes, IMO, too long to do things. Putting 4GB of
RAM in my computer helped, but the buffering algorithms aren't
psychic, so I still pay a penalty the first time I use certain
directories.

Now I haven't tested an SSD for performance, but I know they are
better. If I got one, this problem would likely subside, but I'm not
convinced that SSDs are durable enough, despite what the manufacturers
say. I haven't seen many torture tests on them, but the fact that
erasing a block destroys it a little bit is scary. I do a lot of
sustained writes with my typical desktop workload over the same files,
and I'd rather not trust them to something that is delicate enough to
need filesystem algorithms to be optimized for so they don't wear
out.

I guess, in essence, I just want my flying car today.


 - erik





Re: [9fans] threads vs forks

2009-03-04 Thread ron minnich
On Wed, Mar 4, 2009 at 8:52 AM, J.R. Mauro jrm8...@gmail.com wrote:

 Now I haven't tested an SSD for performance, but I know they are
 better.

Well that I don't understand at all. Is this faith-based performance
measurement? :-)

I have a friend who is doing lots of SSD testing and they're not
always better. For some cases, you pay a whole lot more for 2x greater
throughput.

it's not as simple as "I know they are better."

If I got one, this problem would likely subside, but I'm not
 convinced that SSDs are durable enough, despite what the manufacturers
 say. I haven't seen many torture tests on them, but the fact that
 erasing a block destroys it a little bit is scary. I do a lot of
 sustained writes with my typical desktop workload over the same files,
 and I'd rather not trust them to something that is delicate enough to
 need filesystem algorithms to be optimized for so they don't wear
 out.

in most cases wear leveling is not in the file system. It's in the
hardware or in a powerpc that is in the SSD controller.  It's worth
your doing some reading here.
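A toy model of the controller-level wear leveling described above (every name and number here is illustrative, not how any real SSD firmware works): logical writes are steered to the least-worn physical block, so hammering one logical address still spreads erases evenly across the device.

```python
# Toy sketch of wear leveling below the filesystem: the "controller"
# remaps each logical write to the physical block with the fewest
# erases. NBLOCKS, write(), and mapping are illustrative names.
NBLOCKS = 8
erase_counts = [0] * NBLOCKS
mapping = {}  # logical block -> current physical block


def write(logical):
    # pick the least-worn physical block (ties go to the lowest index)
    phys = min(range(NBLOCKS), key=lambda b: erase_counts[b])
    erase_counts[phys] += 1
    mapping[logical] = phys


for _ in range(80):
    write(0)  # hammer a single logical block

print(max(erase_counts) - min(erase_counts))  # prints 0: wear stays even
```

The point of the sketch is that the filesystem never sees this remapping, which is why it lives in the controller rather than in filesystem algorithms.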

That said, I sure would like to have a fusion IO card for venti. From
what my friend is telling me the fusion card would be ideal for venti
-- as long as we keep only the arenas  on it.

ron



Re: [9fans] threads vs forks

2009-03-04 Thread erik quanstrom
 That said, I sure would like to have a fusion IO card for venti. From
 what my friend is telling me the fusion card would be ideal for venti
 -- as long as we keep only the arenas  on it.

even better for ken's fs.  i would imagine the performance difference
between the fusion i/o card and mass storage is similar to that between
wrens and the jukebox.

- erik



Re: [9fans] threads vs forks

2009-03-04 Thread J.R. Mauro
On Wed, Mar 4, 2009 at 12:14 PM, ron minnich rminn...@gmail.com wrote:
 On Wed, Mar 4, 2009 at 8:52 AM, J.R. Mauro jrm8...@gmail.com wrote:

 Now I haven't tested an SSD for performance, but I know they are
 better.

 Well that I don't understand at all. Is this faith-based performance
 measurement? :-)

No, I have seen several benchmarks. The benchmarks I haven't seen are
ones for "how long does it take to actually break these drives?" from
anyone other than the manufacturer.


 I have a friend who is doing lots of SSD testing and they're not
 always better. For some cases, you pay a whole lot more for 2x greater
 throughput.

 it's not as simple as know they are better.

What types of things degrade their performance? I'm interested in
seeing other data than the handful of benchmarks I've seen. I imagine
writes would be the culprit since you have to erase a whole block
first?


If I got one, this problem would likely subside, but I'm not
 convinced that SSDs are durable enough, despite what the manufacturers
 say. I haven't seen many torture tests on them, but the fact that
 erasing a block destroys it a little bit is scary. I do a lot of
 sustained writes with my typical desktop workload over the same files,
 and I'd rather not trust them to something that is delicate enough to
 need filesystem algorithms to be optimized for so they don't wear
 out.

 in most cases wear leveling is not in the file system. It's in the
 hardware or in a powerpc that is in the SSD controller.  It's worth
 your doing some reading here.

I've seen a lot about optimizing the next-generation filesystems for
flash. Despite the claims that the hardware-based solutions will be
satisfactory, there are a lot of people interested in making existing
filesystems smarter about SSDs, both for wear and for optimizing
read/write.

Beyond that, though, I feel very shaky just hearing the term "wear
leveling." I've had more flash-based devices fail on me than hard
drives, but maybe I'm just crazy and the technology has gotten decent
enough in the past couple years to allay my worrying. It would just be
nice to see a bit stronger alternative being pushed as hard as SSDs.


 That said, I sure would like to have a fusion IO card for venti. From
 what my friend is telling me the fusion card would be ideal for venti
 -- as long as we keep only the arenas  on it.

 ron





Re: [9fans] threads vs forks

2009-03-04 Thread erik quanstrom
 On Wed, Mar 04, 2009 at 10:32:55PM -0500, J.R. Mauro wrote:
  What types of things degrade their performance? I'm interested in
  seeing other data than the handful of benchmarks I've seen. I imagine
  writes would be the culprit since you have to erase a whole block
  first?
 
 Being full.  Small random writes, too, although much more so for
 run-of-the-mill SSDs than for FusionIO.

[citation needed]

- erik



[9fans] threads vs forks

2009-03-03 Thread hugo rivera
Hi,
this is not really a plan 9 question, but since you are the wisest
guys I know I am hoping that you can help me.
You see, I have to launch many tasks running in parallel (~5000) in a
cluster running linux. Each of the tasks performs some astronomical
calculations, and I am not quite sure if using fork is the best answer
here.
First of all, all the programming is done in python and c, and since
we are using os.fork() python facility I think that it is somehow
related to the underlying c fork (well, I really do not know much of
forks in linux, the few things I do know about forks and threads I got
them from Francisco Ballesteros' Introduction to operating system
abstractions).
The point here is whether I should use forks or threads to deal with the job at hand.
I heard that there are some problems if you fork too many processes (I
am not sure how many are too many), so I am thinking of using threads.
I know some basic differences between threads and forks, but I am not
aware of the details of the implementation (probably I will never be).
Finally, if this is a question that does not belong to the plan 9
mailing list, please let me know and I'll shut up.
Saludos

-- 
Hugo
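For concreteness, here is a minimal sketch of the fork-per-task pattern under discussion, with a cap on how many children run at once. MAX_LIVE and NTASKS are illustrative stand-ins (a real run would use the cluster's core count and the ~5000 jobs), and os.fork()/os.waitpid() are POSIX-only.

```python
# Sketch: fork one child per task, never more than MAX_LIVE alive
# at once, reaping finished children as we go. All limits are
# illustrative, not recommendations.
import os

MAX_LIVE = 4   # how many children may run simultaneously
NTASKS = 10    # stand-in for the ~5000 jobs

live = 0
for task in range(NTASKS):
    pid = os.fork()
    if pid == 0:
        # child: this is where one astronomical calculation would run;
        # exit without returning into the parent's loop
        os._exit(0)
    live += 1
    if live >= MAX_LIVE:
        os.waitpid(-1, 0)  # block until any one child finishes
        live -= 1

# reap the remaining children
while live > 0:
    os.waitpid(-1, 0)
    live -= 1
```

Calling os._exit() in the child (rather than sys.exit()) avoids running the parent's cleanup handlers twice, which matters when forking from a larger Python program.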



Re: [9fans] threads vs forks

2009-03-03 Thread David Leimbach
On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:

 Hi,
 this is not really a plan 9 question, but since you are the wisest
 guys I know I am hoping that you can help me.
 You see, I have to launch many tasks running in parallel (~5000) in a
 cluster running linux. Each of the task performs some astronomical
 calculations and I am not pretty sure if using fork is the best answer
 here.
 First of all, all the programming is done in python and c, and since
 we are using os.fork() python facility I think that it is somehow
 related to the underlying c fork (well, I really do not know much of
 forks in linux, the few things I do know about forks and threads I got
 them from Francisco Ballesteros' Introduction to operating system
 abstractions).


My knowledge on this subject is about 8 or 9 years old, so check with
your local Python guru

The last I'd heard about Python's threading is that it was cooperative only,
and that you couldn't get real parallelism out of it.  It serves as a means
to organize your program in a concurrent manner.

In other words no two threads run at the same time in Python, even if you're
on a multi-core system, due to something they call a Global Interpreter
Lock.
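A minimal sketch of the behavior described above, under CPython (the loop count and names are illustrative): two CPU-bound threads compute correct results, but because the GIL lets only one of them execute Python bytecode at a time, there is no speedup over running the loops back to back.

```python
# Two CPU-bound threads under CPython's GIL: the work interleaves
# but never runs simultaneously, so wall-clock time is roughly the
# same as sequential execution. N and count() are illustrative.
import threading

N = 2_000_000


def count(result, i):
    total = 0
    for _ in range(N):
        total += 1
    result[i] = total


results = [0, 0]
threads = [threading.Thread(target=count, args=(results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [2000000, 2000000] -- correct, but not parallel
```

Threads blocked on I/O do release the GIL, which is why Python threading still helps for network- or disk-bound programs even though it cannot use a second core for computation.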



 The point here is if I should use forks or threads to deal with the job at
 hand?
 I heard that there are some problems if you fork too many processes (I
 am not sure how many are too many) so I am thinking to use threads.
 I know some basic differences between threads and forks, but I am not
 aware of the details of the implementation (probably I will never be).
 Finally, if this is a question that does not belong to the plan 9
 mailing list, please let me know and I'll shut up.
 Saludos


I think you need to understand the system limits, which is something you can
look up for yourself.  Also you should understand what kind of runtime model
threads in the language you're using actually implements.

Those rules basically apply to any system.



 --
 Hugo




Re: [9fans] threads vs forks

2009-03-03 Thread hugo rivera
thanks a lot guys.
I think I should study this issue in greater detail. It is not as easy
as I thought it would be.

2009/3/3, David Leimbach leim...@gmail.com:


 On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:
  Hi,
  this is not really a plan 9 question, but since you are the wisest
  guys I know I am hoping that you can help me.
  You see, I have to launch many tasks running in parallel (~5000) in a
  cluster running linux. Each of the task performs some astronomical
  calculations and I am not pretty sure if using fork is the best answer
  here.
  First of all, all the programming is done in python and c, and since
  we are using os.fork() python facility I think that it is somehow
  related to the underlying c fork (well, I really do not know much of
  forks in linux, the few things I do know about forks and threads I got
  them from Francisco Ballesteros' Introduction to operating system
  abstractions).

 My knowledge on this subject is about 8 or 9 years old, so
 check with your local Python guru

 The last I'd heard about Python's threading is that it was cooperative only,
 and that you couldn't get real parallelism out of it.  It serves as a means
 to organize your program in a concurrent manner.

 In other words no two threads run at the same time in Python, even if you're
 on a multi-core system, due to something they call a Global Interpreter
 Lock.

 
  The point here is if I should use forks or threads to deal with the job at
 hand?
  I heard that there are some problems if you fork too many processes (I
  am not sure how many are too many) so I am thinking to use threads.
  I know some basic differences between threads and forks, but I am not
  aware of the details of the implementation (probably I will never be).
  Finally, if this is a question that does not belong to the plan 9
  mailing list, please let me know and I'll shut up.
  Saludos
 

 I think you need to understand the system limits, which is something you can
 look up for yourself.  Also you should understand what kind of runtime model
 threads in the language you're using actually implements.

 Those rules basically apply to any system.

 
  --
  Hugo
 
 




-- 
Hugo



Re: [9fans] threads vs forks

2009-03-03 Thread Uriel
Python 'threads' are the same pthreads turds all other lunix junk
uses. The only difference is that the interpreter itself is not
threadsafe, so they have a global lock which means threads suck even
more than usual.

Forking a python interpreter is a *bad* idea, because python's start
up takes billions of years. This has nothing to do with the merits of
fork, and all with how much python sucks.

There is Stackless Python, which has proper CSP threads/procs and
channels, very similar to limbo.

http://www.stackless.com/

But that is too sane for the mainline python folks obviously, so they
stick to the pthreads turds, ...

My advice: unless you can use Stackless, stay as far away as you can
from any concurrent python stuff. (And don't get me started on twisted
and their event based hacks).

Oh, and as I mentioned in another thread, in my experience if you are
going to fork, make sure you compile statically, dynamic linking is
almost as evil as pthreads. But this is lunix, so what do you expect?

uriel

On Tue, Mar 3, 2009 at 4:19 PM, David Leimbach leim...@gmail.com wrote:


 On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:

 Hi,
 this is not really a plan 9 question, but since you are the wisest
 guys I know I am hoping that you can help me.
 You see, I have to launch many tasks running in parallel (~5000) in a
 cluster running linux. Each of the task performs some astronomical
 calculations and I am not pretty sure if using fork is the best answer
 here.
 First of all, all the programming is done in python and c, and since
 we are using os.fork() python facility I think that it is somehow
 related to the underlying c fork (well, I really do not know much of
 forks in linux, the few things I do know about forks and threads I got
 them from Francisco Ballesteros' Introduction to operating system
 abstractions).

 My knowledge on this subject is about 8 or 9 years old, so check with your local Python guru
 The last I'd heard about Python's threading is that it was cooperative only,
 and that you couldn't get real parallelism out of it.  It serves as a means
 to organize your program in a concurrent manner.
 In other words no two threads run at the same time in Python, even if you're
 on a multi-core system, due to something they call a Global Interpreter
 Lock.


 The point here is if I should use forks or threads to deal with the job at
 hand?
 I heard that there are some problems if you fork too many processes (I
 am not sure how many are too many) so I am thinking to use threads.
 I know some basic differences between threads and forks, but I am not
 aware of the details of the implementation (probably I will never be).
 Finally, if this is a question that does not belong to the plan 9
 mailing list, please let me know and I'll shut up.
 Saludos

 I think you need to understand the system limits, which is something you can
 look up for yourself.  Also you should understand what kind of runtime model
 threads in the language you're using actually implements.
 Those rules basically apply to any system.


 --
 Hugo






Re: [9fans] threads vs forks

2009-03-03 Thread ron minnich
On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:

 You see, I have to launch many tasks running in parallel (~5000) in a
 cluster running linux. Each of the task performs some astronomical
 calculations and I am not pretty sure if using fork is the best answer
 here.


lots of questions first.

how  many cluster nodes. how long do the jobs run. input files or
args? output files? how big? You can't say much with the information
you gave.

ron



Re: [9fans] threads vs forks

2009-03-03 Thread hugo rivera
2009/3/3, Uriel urie...@gmail.com:

  Oh, and as I mentioned in another thread, in my experience if you are
  going to fork, make sure you compile statically, dynamic linking is
  almost as evil as pthreads. But this is lunix, so what do you expect?


not much. Wish I could get it done with plan 9.

-- 
Hugo



Re: [9fans] threads vs forks

2009-03-03 Thread hugo rivera
2009/3/3, ron minnich rminn...@gmail.com:

 lots of questions first .

  how  many cluster nodes. how long do the jobs run. input files or
  args? output files? how big? You can't say much with the information
  you gave.

It is a small cluster of 6 machines. I think each job runs for a few
minutes (~5), takes some input files and generates a couple of files (I
am not really sure how many output files each process
generates). The size of the output files is ~1Mb.

-- 
Hugo



Re: [9fans] threads vs forks

2009-03-03 Thread John Barham
On Tue, Mar 3, 2009 at 3:52 AM, hugo rivera uai...@gmail.com wrote:

 I have to launch many tasks running in parallel (~5000) in a
 cluster running linux. Each of the task performs some astronomical
 calculations and I am not pretty sure if using fork is the best answer
 here.
 First of all, all the programming is done in python and c...

Take a look at the multiprocessing package
(http://docs.python.org/library/multiprocessing.html), newly
introduced with Python 2.6 and 3.0:

"multiprocessing is a package that supports spawning processes using
an API similar to the threading module. The multiprocessing package
offers both local and remote concurrency, effectively side-stepping
the Global Interpreter Lock by using subprocesses instead of threads."

It should be a quick and easy way to set up a cluster-wide job
processing system (provided all your jobs are driven by Python).

It also looks like it's been (partially?) back-ported to Python 2.4
and 2.5: http://pypi.python.org/pypi/processing.

  John
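A hedged sketch of the approach John points to: worker processes sidestep the GIL entirely, and pool.map returns results in input order. Here compute() is an illustrative stand-in for one astronomical calculation, and the pool size of 4 is arbitrary (this uses the explicit close/join idiom so it also matches the 2.6-era API).

```python
# multiprocessing sidesteps the GIL by distributing work across
# worker processes. compute() and the pool size are illustrative.
from multiprocessing import Pool


def compute(x):
    # placeholder for real per-task work
    return x * x


if __name__ == "__main__":
    pool = Pool(processes=4)           # typically one worker per core
    out = pool.map(compute, range(8))  # results come back in order
    pool.close()
    pool.join()
    print(out)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The `__main__` guard is required on platforms where workers are spawned rather than forked, since each worker re-imports the module.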



Re: [9fans] threads vs forks

2009-03-03 Thread ron minnich
On Tue, Mar 3, 2009 at 8:28 AM, hugo rivera uai...@gmail.com wrote:

 It is a small cluster, of 6 machines. I think each job runs for a few
 minutes (~5), take some input files and generate a couple of files (I
 am not really sure about how many output files each proccess
 generates). The size of the output files is ~1Mb.

for that size cluster, and jobs running a few minutes, fork ought to be fine.

ron



Re: [9fans] threads vs forks

2009-03-03 Thread Roman V. Shaposhnik
On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:

 My knowledge on this subject is about 8 or 9 years old, so check with your 
 local Python guru
 
 
 The last I'd heard about Python's threading is that it was cooperative
 only, and that you couldn't get real parallelism out of it.  It serves
 as a means to organize your program in a concurrent manner.  
 
 
 In other words no two threads run at the same time in Python, even if
 you're on a multi-core system, due to something they call a Global
 Interpreter Lock.  

I believe GIL is as present in Python nowadays as ever. On a related
note: does anybody know any sane interpreted languages with a decent
threading model to go along? Stackless python is the only thing that
I'm familiar with in that department.

Thanks,
Roman.




Re: [9fans] threads vs forks

2009-03-03 Thread Bakul Shah
On Tue, 03 Mar 2009 10:11:10 PST Roman V. Shaposhnik r...@sun.com  wrote:
 On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:
 
  My knowledge on this subject is about 8 or 9 years old, so check with your 
 local Python guru
  
  
  The last I'd heard about Python's threading is that it was cooperative
  only, and that you couldn't get real parallelism out of it.  It serves
  as a means to organize your program in a concurrent manner.  
  
  
  In other words no two threads run at the same time in Python, even if
  you're on a multi-core system, due to something they call a Global
  Interpreter Lock.  
 
 I believe GIL is as present in Python nowadays as ever. On a related
 note: does anybody know any sane interpreted languages with a decent
 threading model to go along? Stackless python is the only thing that
 I'm familiar with in that department.

Depend on what you mean by sane interpreted language with a
decent threading model and what you want to do with it but
check out www.clojure.org.  Then there is Erlang.  Its
wikipedia entry has this to say:
Although Erlang was designed to fill a niche and has
remained an obscure language for most of its existence,
it is experiencing a rapid increase in popularity due to
increased demand for concurrent services, inferior models
of concurrency in most mainstream programming languages,
and its substantial libraries and documentation.[7][8]
Well-known applications include Amazon SimpleDB,[9]
Yahoo! Delicious,[10] and the Facebook Chat system.[11]



Re: [9fans] threads vs forks

2009-03-03 Thread J.R. Mauro
On Tue, Mar 3, 2009 at 1:11 PM, Roman V. Shaposhnik r...@sun.com wrote:
 On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:

 My knowledge on this subject is about 8 or 9 years old, so check with your 
 local Python guru


 The last I'd heard about Python's threading is that it was cooperative
 only, and that you couldn't get real parallelism out of it.  It serves
 as a means to organize your program in a concurrent manner.


 In other words no two threads run at the same time in Python, even if
 you're on a multi-core system, due to something they call a Global
 Interpreter Lock.

 I believe GIL is as present in Python nowadays as ever. On a related
 note: does anybody know any sane interpreted languages with a decent
 threading model to go along? Stackless python is the only thing that
 I'm familiar with in that department.

I thought part of the reason for the big break with Python 3000 was
to get rid of the GIL and clean that threading mess up. Or am I way
off?


 Thanks,
 Roman.






Re: [9fans] threads vs forks

2009-03-03 Thread Uriel
You are off. It is doubtful that the GIL will ever be removed.

But that really isn't the issue, the issue is the lack of a decent
concurrency model, like the one provided by Stackless.

But apparently one of the things stackless allows is evil recursive
programming, which Guido considers 'confusing' and won't allow in
mainline python (I think another reason is that porting it to jython
and .not would be hard, but I'm not familiar with the details).

uriel


On Wed, Mar 4, 2009 at 12:08 AM, J.R. Mauro jrm8...@gmail.com wrote:
 On Tue, Mar 3, 2009 at 1:11 PM, Roman V. Shaposhnik r...@sun.com wrote:
 On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:

 My knowledge on this subject is about 8 or 9 years old, so check with your 
 local Python guru


 The last I'd heard about Python's threading is that it was cooperative
 only, and that you couldn't get real parallelism out of it.  It serves
 as a means to organize your program in a concurrent manner.


 In other words no two threads run at the same time in Python, even if
 you're on a multi-core system, due to something they call a Global
 Interpreter Lock.

 I believe GIL is as present in Python nowadays as ever. On a related
 note: does anybody know any sane interpreted languages with a decent
 threading model to go along? Stackless python is the only thing that
 I'm familiar with in that department.

 I thought part of the reason for the big break with Python 3000 was
 to get rid of the GIL and clean that threading mess up. Or am I way
 off?


 Thanks,
 Roman.








Re: [9fans] threads vs forks

2009-03-03 Thread J.R. Mauro
On Tue, Mar 3, 2009 at 6:15 PM, Uriel urie...@gmail.com wrote:
 You are off. It is doubtful that the GIL will ever be removed.

That's too bad. Things like that just reinforce my view that Python is a hack :(

Oh well, back to C...


 But that really isn't the issue, the issue is the lack of a decent
 concurrency model, like the one provided by Stackless.

 But apparently one of the things stackless allows is evil recursive
 programming, which Guido considers 'confusing' and wont allow in
 mainline python (I think another reason is that porting it to jython
 and .not would be hard, but I'm not familiar with the details).

Concurrency seems to be one of those things that's too hard for
everyone, and I don't buy it. There's no reason it needs to be as hard
as it is.

And nevermind the fact that it's not really usable for every (or even
most) jobs out there. But Intel is pushing it, so that's where we have
to go, I suppose.


 uriel

 On Wed, Mar 4, 2009 at 12:08 AM, J.R. Mauro jrm8...@gmail.com wrote:
 On Tue, Mar 3, 2009 at 1:11 PM, Roman V. Shaposhnik r...@sun.com wrote:
 On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:

 My knowledge on this subject is about 8 or 9 years old, so check with your 
 local Python guru


 The last I'd heard about Python's threading is that it was cooperative
 only, and that you couldn't get real parallelism out of it.  It serves
 as a means to organize your program in a concurrent manner.


 In other words no two threads run at the same time in Python, even if
 you're on a multi-core system, due to something they call a Global
 Interpreter Lock.

 I believe GIL is as present in Python nowadays as ever. On a related
 note: does anybody know any sane interpreted languages with a decent
 threading model to go along? Stackless python is the only thing that
 I'm familiar with in that department.

 I thought part of the reason for the big break with Python 3000 was
 to get rid of the GIL and clean that threading mess up. Or am I way
 off?


 Thanks,
 Roman.










Re: [9fans] threads vs forks

2009-03-03 Thread Devon H. O'Dell
2009/3/3 J.R. Mauro jrm8...@gmail.com:
 Concurrency seems to be one of those things that's too hard for
 everyone, and I don't buy it. There's no reason it needs to be as hard
 as it is.

That's a fact. If you have access to The ACM Queue, check out
p16-cantrill-concurrency.pdf (Cantrill and Bonwick on concurrency).

 And nevermind the fact that it's not really usable for every (or even
 most) jobs out there. But Intel is pushing it, so that's where we have
 to go, I suppose.

That's simply not true. In my world (server software and networking),
most tasks can be improved by utilizing concurrent programming
paradigms. Even in user interfaces, these are useful. For mathematics,
there's simply no question that making use of concurrent algorithms is
a win. In fact, I can't think of a single case in which doing two
lines of work at once isn't better than doing one at a time, assuming
that accuracy is maintained in the result.

--dho



Re: [9fans] threads vs forks

2009-03-03 Thread J.R. Mauro
On Tue, Mar 3, 2009 at 6:54 PM, Devon H. O'Dell devon.od...@gmail.com wrote:
 2009/3/3 J.R. Mauro jrm8...@gmail.com:
 Concurrency seems to be one of those things that's too hard for
 everyone, and I don't buy it. There's no reason it needs to be as hard
 as it is.

 That's a fact. If you have access to The ACM Queue, check out
 p16-cantrill-concurrency.pdf (Cantrill and Bonwick on concurrency).

Things like TBB and other libraries to automagically scale up repeated
operations into parallelized ones help alleviate the problems with
getting parallelization to work. They're ugly, they only address
narrow problem sets, but they're attempts at solutions. And if you
look at languages like LISP and Erlang, you're definitely left with a
feeling that parallelization is being treated as harder than it is.

I'm not saying it isn't hard, just that there are a lot of people who
seem to be throwing up their hands over it. I suppose I should stop
reading their material.


 And nevermind the fact that it's not really usable for every (or even
 most) jobs out there. But Intel is pushing it, so that's where we have
 to go, I suppose.

 That's simply not true. In my world (server software and networking),
 most tasks can be improved by utilizing concurrent programming
 paradigms. Even in user interfaces, these are useful. For mathematics,
 there's simply no question that making use of concurrent algorithms is
 a win. In fact, I can't think of a single case in which doing two
 lines of work at once isn't better than doing one at a time, assuming
 that accuracy is maintained in the result.

I should have qualified. I mean *massive* parallelization when applied
to average use cases. I don't think it's totally unusable (I
complain about synchronous I/O on my phone every day), but it's being
pushed as a panacea, and that is what I think is wrong. Don Knuth
holds this opinion, but I think he's mostly alone on that,
unfortunately.

Of course for mathematically intensive and large-scale operations, the
more parallel you can make things the better.


 --dho





Re: [9fans] threads vs forks

2009-03-03 Thread erik quanstrom
 I should have qualified. I mean *massive* parallelization when applied
 to average use cases. I don't think it's totally unusable (I
 complain about synchronous I/O on my phone every day), but it's being
 pushed as a panacea, and that is what I think is wrong. Don Knuth
 holds this opinion, but I think he's mostly alone on that,
 unfortunately.

it's interesting that parallel wasn't cool when chips were getting
noticeably faster rapidly.  perhaps the focus on parallelization
is a sign there aren't any other ideas.

- erik



Re: [9fans] threads vs forks

2009-03-03 Thread J.R. Mauro
On Tue, Mar 3, 2009 at 7:54 PM, erik quanstrom quans...@quanstro.net wrote:
 I should have qualified. I mean *massive* parallelization when applied
 to average use cases. I don't think it's totally unusable (I
 complain about synchronous I/O on my phone every day), but it's being
 pushed as a panacea, and that is what I think is wrong. Don Knuth
 holds this opinion, but I think he's mostly alone on that,
 unfortunately.

 it's interesting that parallel wasn't cool when chips were getting
 noticeably faster rapidly.  perhaps the focus on parallelization
 is a sign there aren't any other ideas.

Indeed, I think it is. The big manufacturers seem to have hit a wall
with clock speed, done a full reverse, and are now just trying to pack
more transistors and cores on the chip. Not that this is evil, but I
think this is just as bad as the obsession with upping the clock
speeds in that they're too focused on one path instead of
incorporating other cool ideas (e.g., things Transmeta was working on
with virtualization and hosting foreign ISAs)


 - erik





Re: [9fans] threads vs forks

2009-03-03 Thread John Barham
On Tue, Mar 3, 2009 at 4:54 PM, erik quanstrom quans...@quanstro.net wrote:
 I should have qualified. I mean *massive* parallelization when applied
 to average use cases. I don't think it's totally unusable (I
 complain about synchronous I/O on my phone every day), but it's being
 pushed as a panacea, and that is what I think is wrong. Don Knuth
 holds this opinion, but I think he's mostly alone on that,
 unfortunately.

 it's interesting that parallel wasn't cool when chips were getting
 noticeably faster rapidly.  perhaps the focus on parallelization
 is a sign there aren't any other ideas.

That seems to be what Knuth thinks.  Excerpt from a 2008 interview w/ InformIT:

InformIT: Vendors of multicore processors have expressed frustration
at the difficulty of moving developers to this model. As a former
professor, what thoughts do you have on this transition and how to
make it happen? Is it a question of proper tools, such as better
native support for concurrency in languages, or of execution
frameworks? Or are there other solutions?

Knuth: I don’t want to duck your question entirely. I might as well
flame a bit about my personal unhappiness with the current trend
toward multicore architecture. To me, it looks more or less like the
hardware designers have run out of ideas, and that they’re trying to
pass the blame for the future demise of Moore’s Law to the software
writers by giving us machines that work faster only on a few key
benchmarks! I won’t be surprised at all if the whole multithreading
idea turns out to be a flop, worse than the Itanium approach that
was supposed to be so terrific—until it turned out that the wished-for
compilers were basically impossible to write.

Full interview is at http://www.informit.com/articles/article.aspx?p=1193856.



Re: [9fans] threads vs forks

2009-03-03 Thread James Tomaschke

J.R. Mauro wrote:

On Tue, Mar 3, 2009 at 7:54 PM, erik quanstrom quans...@quanstro.net wrote:

I should have qualified. I mean *massive* parallelization when applied
to average use cases. I don't think it's totally unusable (I
complain about synchronous I/O on my phone every day), but it's being
pushed as a panacea, and that is what I think is wrong. Don Knuth
holds this opinion, but I think he's mostly alone on that,
unfortunately.

it's interesting that parallel wasn't cool when chips were getting
noticeably faster rapidly.  perhaps the focus on parallelization
is a sign there aren't any other ideas.


Indeed, I think it is. The big manufacturers seem to have hit a wall
with clock speed, done a full reverse, and are now just trying to pack
more transistors and cores on the chip. Not that this is evil, but I
think this is just as bad as the obsession with upping the clock
speeds in that they're too focused on one path instead of
incorporating other cool ideas (i.e., things Transmeta was working on
with virtualization and hosting foreign ISAs)


Die size has been the main focus for the foundries; reduced transistor 
switch time is just a side benefit of that.  Digital components scale well 
with shrinking geometries, but analog suffers, and generating a stable 
clock at high frequency is an analog-domain problem.


It is much easier to double the transistor count than it is to double 
the clock frequency.  You also have to consider the power, heat, and noise 
costs of increasing the clock.


I think the reason you didn't see parallelism come out earlier in 
the PC market is that they needed to create new mechanisms for I/O. 
 AMD did this with HyperTransport, and I've seen 32-core (8-socket) 
systems built on it.  Now Intel has its own I/O rethink out there.


I've been trying to get my industry to look at parallel computing for 
many years, and it's only now that they are starting to sell parallel 
circuit simulators, and still they are not that efficient.  A 
traditionally week-long sim now takes a single day when run on 
12 cores.  I'll take that 7x over 1x anytime, though.
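
For what it's worth, 7x on 12 cores is about what Amdahl's law predicts for a workload that is roughly 93% parallelizable. A quick back-of-the-envelope sketch (the numbers and function names are mine, not the simulator vendor's):

```python
# Amdahl's law: speedup on n cores when a fraction p of the work parallelizes.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

def parallel_fraction(speedup, n):
    """Invert Amdahl's law: what fraction p explains an observed speedup?"""
    # speedup = 1 / ((1-p) + p/n)  =>  p = (1 - 1/speedup) / (1 - 1/n)
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)

p = parallel_fraction(7.0, 12)           # the week-to-a-day sim above
print(round(p, 3))                       # ~0.935: ~93% of the work parallelizes
print(round(amdahl_speedup(p, 1e9), 1))  # cap with unlimited cores: ~15.4x
```

So even with unlimited cores, the serial ~7% of that workload caps the speedup at around 15x.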


/james



Re: [9fans] threads vs forks

2009-03-03 Thread erik quanstrom
 I think the reason why you didn't see parallelism come out earlier in 
 the PC market was because they needed to create new mechanisms for I/O. 
   AMD did this with Hypertransport, and I've seen 32-core (8-socket) 
 systems with this.  Now Intel has their own I/O rethink out there.

i think what you're saying is equivalent to saying
(in terms i understand) that memory bandwidth was
so bad that a second processor couldn't do much work.

but i haven't found this to be the case.  even the
highly constrained pentium 4 gets some milage out of
hyperthreading for the tests i've run.

the intel 5000-series still use a fsb.  and they seem to
scale well from 1 to 4 cores.

are there benchmarks that show otherwise similar
hypertransport systems trouncing intel in multithreaded
performance?  i don't recall seeing anything more than
a moderate (15-20%) advantage.

- erik



Re: [9fans] threads vs forks

2009-03-03 Thread James Tomaschke

erik quanstrom wrote:
I think the reason why you didn't see parallelism come out earlier in 
the PC market was because they needed to create new mechanisms for I/O. 
  AMD did this with Hypertransport, and I've seen 32-core (8-socket) 
systems with this.  Now Intel has their own I/O rethink out there.


i think what you're saying is equivalent to saying
(in terms i understand) that memory bandwidth was
so bad that a second processor couldn't do much work.

Yes bandwidth and latency.


but i haven't found this to be the case.  even the
highly constrained pentium 4 gets some mileage out of
hyperthreading for the tests i've run.

the intel 5000-series still use a fsb.  and they seem to
scale well from 1 to 4 cores.


Many of the circuit simulators I use fall flat on their faces beyond 4 
cores or so.  However, I blame this on their algorithms, not the hardware.


I wasn't making an AMD vs Intel comment, just that AMD had created HTX 
along with their K8 platform to address scalability concerns with I/O.



are there benchmarks that show otherwise similar
hypertransport systems trouncing intel in multithreaded
performance?  i don't recall seeing anything more than
a moderate (15-20%) advantage.


I don't have a 16-core Intel system to compare with, but:
http://en.wikipedia.org/wiki/List_of_device_bandwidths#Computer_buses

I think the reason why Intel developed their Common System Interface 
(now called QuickPath Interconnect) was to address its shortcomings.


Both AMD and Intel are looking at I/O because it is and will be a 
limiting factor when scaling to higher core counts.




- erik







Re: [9fans] threads vs forks

2009-03-03 Thread J.R. Mauro
On Tue, Mar 3, 2009 at 11:44 PM, James Tomaschke ja...@orcasystems.com wrote:
 erik quanstrom wrote:

 I think the reason why you didn't see parallelism come out earlier in the
 PC market was because they needed to create new mechanisms for I/O.  AMD did
 this with Hypertransport, and I've seen 32-core (8-socket) systems with
 this.  Now Intel has their own I/O rethink out there.

 i think what you're saying is equivalent to saying
 (in terms i understand) that memory bandwidth was
 so bad that a second processor couldn't do much work.

 Yes bandwidth and latency.

 but i haven't found this to be the case.  even the
 highly constrained pentium 4 gets some mileage out of
 hyperthreading for the tests i've run.

 the intel 5000-series still use a fsb.  and they seem to
 scale well from 1 to 4 cores.

 Many of the circuit simulators I use fall flat on their face after 4 cores,
 say.  However I blame this on their algorithm not hardware.

 I wasn't making an AMD vs Intel comment, just that AMD had created HTX along
 with their K8 platform to address scalability concerns with I/O.

 are there benchmarks that show otherwise similar
 hypertransport systems trouncing intel in multithreaded
 performance?  i don't recall seeing anything more than
 a moderate (15-20%) advantage.

 I don't have a 16-core Intel system to compare with, but:
 http://en.wikipedia.org/wiki/List_of_device_bandwidths#Computer_buses

 I think the reason why Intel developed their Common Systems Interconnect
 (now called QuickPath Interconnect) was to address its shortcomings.

 Both AMD and Intel are looking at I/O because it is and will be a limiting
 factor when scaling to higher core counts.

And soon hard disk latencies are really going to start hurting (they
already are hurting some, I'm sure), and I'm not convinced of the
viability of SSDs.


There was an interesting article I came across that compared the
latencies of accessing a register, a CPU cache, main memory, and disk,
which put them in human terms. As much as we like to say we understand
the difference between a millisecond and a nanosecond, seeing cache
access expressed in terms of moments and a disk access in terms of
years was rather illuminating, if only to me.
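
The rescaling trick is easy to reproduce: pretend one CPU cycle takes one second and see what everything else costs. The latency figures below are my own rough circa-2009 order-of-magnitude guesses, not the article's numbers:

```python
# Rescale hardware latencies so one CPU cycle = one second ("human terms").
NS_PER_CYCLE = 0.3               # ~3 GHz clock (assumption)

latencies_ns = {                 # approximate orders of magnitude
    "register":            0.3,
    "L2 cache":            5.0,
    "main memory":       100.0,
    "disk seek":  10_000_000.0,  # ~10 ms
}

for what, ns in latencies_ns.items():
    human_s = ns / NS_PER_CYCLE            # seconds in the rescaled world
    print("%-12s %14.1f ns  ->  %12.0f s  (%.1f days)"
          % (what, ns, human_s, human_s / 86_400))
```

With those figures, a register access is a second, main memory is a few minutes, and a single disk seek is over a year: exactly the "moments vs. years" contrast the article drew.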

Same article also put a google search at only slightly slower latency
than hard disk access. The internet really is becoming the computer, I
suppose.



 - erik








Re: [9fans] threads vs forks

2009-03-03 Thread David Leimbach
On Tue, Mar 3, 2009 at 10:11 AM, Roman V. Shaposhnik r...@sun.com wrote:

 On Tue, 2009-03-03 at 07:19 -0800, David Leimbach wrote:

  My knowledge on this subject is about 8 or 9 years old, so check with
 your local Python guru
 
 
  The last I'd heard about Python's threading is that it was cooperative
  only, and that you couldn't get real parallelism out of it.  It serves
  as a means to organize your program in a concurrent manner.
 
 
  In other words no two threads run at the same time in Python, even if
  you're on a multi-core system, due to something they call a Global
  Interpreter Lock.

 I believe GIL is as present in Python nowadays as ever. On a related
 note: does anybody know any sane interpreted languages with a decent
 threading model to go along? Stackless python is the only thing that
 I'm familiar with in that department.


I'm a fan of Erlang.  Though I guess it's technically a compiled virtual
machine of sorts, even when it's escript.

But I've had an absolutely awesome experience over the last year using it,
and so far only wishing it came with the type safety of Haskell :-).

I love Haskell's threading model, actually; both the data-parallelism and
the forkIO interfaces are pretty sane.  Typed data channels, even between
forkIO'd threads.
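
As an aside, the GIL behavior quoted above is easy to demonstrate. A minimal sketch (CPython assumed; timings are machine-dependent and purely illustrative):

```python
# CPU-bound work gains nothing from threads under the GIL, while separate
# processes can use real cores.
import time
from threading import Thread
from multiprocessing import Process

def burn(n=2_000_000):
    while n:                 # pure-Python loop: holds the GIL the whole time
        n -= 1

def timed(worker_cls, count=2):
    """Run `count` workers of worker_cls concurrently; return elapsed seconds."""
    start = time.perf_counter()
    workers = [worker_cls(target=burn) for _ in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print("threads:   %.2fs" % timed(Thread))   # roughly 2x one burn(): serialized
    print("processes: %.2fs" % timed(Process))  # roughly 1x, given two free cores
```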




 Thanks,
 Roman.





Re: [9fans] threads vs forks

2009-03-03 Thread David Leimbach
On Tue, Mar 3, 2009 at 5:54 PM, J.R. Mauro jrm8...@gmail.com wrote:

 On Tue, Mar 3, 2009 at 7:54 PM, erik quanstrom quans...@quanstro.net
 wrote:
  I should have qualified. I mean *massive* parallelization when applied
  to average use cases. I don't think it's totally unusable (I
  complain about synchronous I/O on my phone every day), but it's being
  pushed as a panacea, and that is what I think is wrong. Don Knuth
  holds this opinion, but I think he's mostly alone on that,
  unfortunately.
 
  it's interesting that parallel wasn't cool when chips were getting
  noticably faster rapidly.  perhaps the focus on parallelization
  is a sign there aren't any other ideas.

 Indeed, I think it is. The big manufacturers seem to have hit a wall
 with clock speed, done a full reverse, and are now just trying to pack
 more transistors and cores on the chip. Not that this is evil, but I
 think this is just as bad as the obsession with upping the clock
 speeds in that they're too focused on one path instead of
 incorporating other cool ideas (i.e., things Transmeta was working on
 with virtualization and hosting foreign ISAs)


Can we bring back the Burroughs? :-)




 
  - erik
 
 




Re: [9fans] threads vs forks

2009-03-03 Thread John Barham
 I believe GIL is as present in Python nowadays as ever. On a related
 note: does anybody know any sane interpreted languages with a decent
 threading model to go along? Stackless python is the only thing that
 I'm familiar with in that department.

Check out Lua's coroutines: http://www.lua.org/manual/5.1/manual.html#2.11

Here's an implementation of the sieve of Eratosthenes using Lua
coroutines similar to the Limbo one:
http://www.lua.org/cgi-bin/demo?sieve
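
For comparison, roughly the same pipeline-of-filters sieve can be sketched with Python generators (a loose analogue, not a translation of the Lua or Limbo code):

```python
# Sieve of Eratosthenes as a chain of generator "coroutines": each prime
# found adds one more filter stage to the pipeline.
def numbers():
    n = 2
    while True:              # endless stream 2, 3, 4, ...
        yield n
        n += 1

def filter_mult(prime, src):
    for n in src:
        if n % prime:        # pass through only non-multiples of prime
            yield n

def primes(count):
    src = numbers()
    out = []
    while len(out) < count:
        p = next(src)        # whatever survives all filters is prime
        out.append(p)
        src = filter_mult(p, src)
    return out

print(primes(10))   # -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```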



Re: [9fans] threads vs forks

2009-03-03 Thread erik quanstrom
 Now there is another use that would at least be intellectually interesting
 and possibly useful in practice.  Use the transistors for a really big
 memory running at cache speed.  But instead of it being a hardware
 cache, manage it explicitly.  In effect, we have a very high speed
 main memory, and the traditional main memory is backing store.
 It'd give a use for all those paging algorithms that aren't particularly
 justified at the main memory-disk boundary any more.  And you
 can fit a lot of Plan 9 executable images in a 64MB on-chip memory
 space.  Obviously, it wouldn't be a good fit for severely memory-hungry
 apps, and it might be a dead end overall, but it'd at least be something
 different...

ken's fs already has the machinery to handle this.  one could imagine
a cachefs that knew how to manage this for venti.  (though venti seems
like a poor fit.)  there are lots of interesting uses of explicitly managed,
hierarchical caches.  yet so far hardware has done its level best to hide
this.
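
as a concrete example of the kind of policy an explicitly managed memory would need, here's a minimal LRU replacement sketch (illustrative only; nothing to do with ken's fs internals):

```python
# Least-recently-used page replacement, the classic policy the quoted text
# alludes to, in a few lines.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()        # key -> page, oldest first

    def access(self, key):
        """Touch a page; return True on hit, False on miss (with eviction)."""
        if key in self.pages:
            self.pages.move_to_end(key)   # mark most recently used
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        self.pages[key] = True
        return False

cache = LRUCache(3)
hits = [cache.access(k) for k in "abcabdc"]
print(hits)   # -> [False, False, False, True, True, False, False]
```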

- erik



Re: [9fans] threads vs forks

2009-03-03 Thread erik quanstrom
 
  Both AMD and Intel are looking at I/O because it is and will be a limiting
  factor when scaling to higher core counts.

i/o starts sucking wind with one core.  
that's why we differentiate i/o from everything
else we do.

 And soon hard disk latencies are really going to start hurting (they
 already are hurting some, I'm sure), and I'm not convinced of the
 viability of SSDs.

i'll assume you mean throughput.  hard drive latency has been a big deal
for a long time.  tanenbaum integrated knowledge of track layout into
his minix elevator algorithm.
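
the elevator idea fits in one screenful; a generic SCAN sketch (not minix code, and ignoring the track-layout knowledge tanenbaum added):

```python
# Elevator (SCAN) disk scheduling: service pending requests in track order
# in one direction, then sweep back, instead of seeking in arrival order.
def elevator(head, requests):
    """Order requests so the head sweeps up from `head`, then back down."""
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    return up + down

print(elevator(50, [10, 95, 42, 60, 77, 12]))
# -> [60, 77, 95, 42, 12, 10]
```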

i think the gap between cpu performance and hd performance is narrowing,
not getting wider.

i don't have accurate measurements on how much real-world performance
difference there is between a core i7 and an intel 5000.  it's generally not
spectacular, clock-for-clock. on the other hand, when the intel 5000-series
was released, the rule of thumb for a sata hd was 50mb/s.  it's not too hard
to find regular sata hard drives that do 110mb/s today.  the ssd drives we've
(coraid) tested have been spectacular --- reading at > 200mb/s.  if you want
to talk latency, ssds can deliver 1/100th the latency of spinning media.
there's no way that the core i7 is 100x faster than the intel 5000.

- erik



Re: [9fans] threads vs forks

2009-03-03 Thread andrey mirtchovski
 the ssd drives we've
 (coraid) tested have been spectacular --- reading at > 200mb/s.

you know, i've read all the reviews and seen all the windows
benchmarks. but this info, coming from somebody on this list, is much
more reassuring than all the slashdot articles.

the tests didn't involve plan9 by any chance, did they? ;)