Re: [Haifux] Implementing read() like UNIX guys like it

2011-04-26 Thread Shachar Raindel
On Tue, Apr 26, 2011 at 11:33 PM, Nadav Har'El  wrote:
> On Tue, Apr 26, 2011, Eli Billauer wrote about "Re: [Haifux] Implementing 
> read() like UNIX guys like it":
>> >(...) Second, if the CPU *did* have "something useful" to do (run other
>> >processes,
>> >or whatever), it would, causing a bit more time to pass between the read()
>> >and it might return more than one byte. It will only return one byte when
>> >there's nothing better to do than calling read() all the time.
>> >
>> That's an interesting point, but I'm not 100% sure on that one. Why
>> would the scheduler take the CPU away, if the read() operation always
>> returns only a byte or two? It would eventually take it away for a while, which
>> would let data stack up, but each process would get its fair slice.
>
> Hi,
>
> Well, I guess that under a very specific set of circumstances (and a fairly
> high throughput, on a modern CPU which can do a billion cycles a second),
> the read() of one byte will take exactly the amount of time that it takes
> for another byte to become ready. But when the incoming stream is slower than
> that, a read() will very often block, and can cause a context switch if
> other processes are waiting to run.
>

Note that once you have implemented "give away the CPU while there is no
input, until input arrives", adding the delay bit becomes relatively
simple: store the jiffies value when the first byte is written into the
buffer, compare it to the current jiffies in read(), and if less than the
minimum delay has passed, go into a timed sleep.
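
Something along these lines, for example - a completely untested sketch,
with the struct and field names invented just for illustration (this is
not taken from Eli's actual driver):

#include <linux/errno.h>
#include <linux/fs.h>
#include <linux/jiffies.h>
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/uaccess.h>
#include <linux/wait.h>

#define MYDEV_MIN_DELAY  msecs_to_jiffies(10)   /* "a little" = 10 ms, say */

struct mydev {                        /* invented for the example */
    wait_queue_head_t wq;             /* woken by the ISR on new data */
    unsigned long first_jiffies;      /* jiffies when first pending byte arrived */
    size_t count;                     /* bytes currently buffered */
    char buffer[4096];
};

static ssize_t mydev_read(struct file *filp, char __user *buf,
                          size_t len, loff_t *ppos)
{
    struct mydev *dev = filp->private_data;
    unsigned long deadline;
    size_t n;

    /* Give the CPU away until the ISR has buffered at least one byte. */
    if (wait_event_interruptible(dev->wq, dev->count > 0))
        return -ERESTARTSYS;

    /* The delay bit: let data pile up until the minimum delay has passed
     * since the first pending byte was written into the buffer. */
    deadline = dev->first_jiffies + MYDEV_MIN_DELAY;
    while (time_before(jiffies, deadline)) {
        schedule_timeout_interruptible(deadline - jiffies);
        if (signal_pending(current))
            return -ERESTARTSYS;
    }

    /* Return whatever has accumulated; ring-buffer bookkeeping and
     * locking against the ISR are omitted here. */
    n = min_t(size_t, len, dev->count);
    if (copy_to_user(buf, dev->buffer, n))
        return -EFAULT;
    dev->count -= n;
    return n;
}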

>> I'm not saying this would cause some real degradation, but on a slow
>> embedded processor, seeing 10% CPU usage on a process which is supposed
>> to just read data...? The calculation is a 200 MHz processor, 100 kB/sec
>> and 200 CPU clocks spent on processing each byte.
>
> I see. Like I said, I don't know how meaningful this "10%" figure is when
> you're talking about an endless-read()-busy-loop with nothing else to do
> (if there's nothing else to do, you don't care if it takes 100% cpu :-)).
> If you do have other things to do - in the same thread or in a different
> thread - then possibly this 10% figure would be reduced because read()s would
> start returning more than 1 byte. By the way, if in your hypothetical
> situation a read() takes 200 cycles to return, but the bytes arrive 2,000
> cycles apart, the following read() will block, and therefore possibly cause a
> context switch.

Two reasons why this might be important:
a. The impact of a task using "10%" CPU is more than 10% of the processing
power - it also pollutes caches, eats up CPU time in context switches, and
prevents "idle level" (nice'd) tasks from running.
b. A task using 10% CPU prevents the CPU from becoming idle for any
significant stretch of time, and a non-idle CPU consumes much more energy
than an idle one. While Eli's application might not be power sensitive,
many applications are. In some cases the Linux kernel goes to great lengths
to postpone work and batch it, so as to reduce the number of hardware
wakeups required (see the tickless kernel effort, powertop and the related
changes across the entire system, and laptop-mode's handling of hard-disk
writes).

>
> But if you have the time and energy to program that additional read delay,
> then by all means, go ahead and try it. And do tell how much the difference
> was noticeable in the end system.

I second the "do tell" request. I would be especially interested in
hearing the results of a benchmark such as: run a nice'd CPU-intensive
task in parallel with the hardware-interacting task, and check the change
in time-to-complete of the CPU-intensive task.

--Shachar


Re: [Haifux] Implementing read() like UNIX guys like it

2011-04-26 Thread Nadav Har'El
On Tue, Apr 26, 2011, Eli Billauer wrote about "Re: [Haifux] Implementing 
read() like UNIX guys like it":
> >(...) Second, if the CPU *did* have "something useful" to do (run other 
> >processes,
> >or whatever), it would, causing a bit more time to pass between the read()
> >and it might return more than one byte. It will only return one byte when
> >there's nothing better to do than calling read() all the time.
> >  
> That's an interesting point, but I'm not 100% sure on that one. Why 
> would the scheduler take the CPU away, if the read() operation always 
> returns only a byte or two? It would eventually take it away for a while, which
> would let data stack up, but each process would get its fair slice.

Hi,

Well, I guess that under a very specific set of circumstances (and a fairly
high throughput, on a modern CPU which can do a billion cycles a second),
the read() of one byte will take exactly the amount of time that it takes
for another byte to become ready. But when the incoming stream is slower than
that, a read() will very often block, and can cause a context switch if
other processes are waiting to run.

> I'm not saying this would cause some real degradation, but on a slow 
> embedded processor, seeing 10% CPU usage on a process which is supposed 
> to just read data...? The calculation is a 200 MHz processor, 100 kB/sec 
> and 200 CPU clocks spent on processing each byte.

I see. Like I said, I don't know how meaningful this "10%" figure is when
you're talking about an endless-read()-busy-loop with nothing else to do
(if there's nothing else to do, you don't care if it takes 100% cpu :-)).
If you do have other things to do - in the same thread or in a different
thread - then possibly this 10% figure would be reduced because read()s would
start returning more than 1 byte. By the way, if in your hypothetical
situation a read() takes 200 cycles to return, but the bytes arrive 2,000
cycles apart, the following read() will block, and therefore possibly cause a
context switch.

But if you have the time and energy to program that additional read delay,
then by all means, go ahead and try it. And do tell how much the difference
was noticeable in the end system.

-- 
Nadav Har'El|  Tuesday, Apr 26 2011, 23 Nisan 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Wear short sleeves! Support your right to
http://nadav.harel.org.il   |bare arms!


Re: [Haifux] Implementing read() like UNIX guys like it

2011-04-26 Thread Eli Billauer

Hello,


Regarding your previous mail, we agree, except that I will implement the 
"wait a little" thing. Believe me, after implementing the hardware part, 
kernel programming is a breeze.



Nadav Har'El wrote:

>> The drawback of doing this
>> exactly like this, is that if data arrives at a slow rate (say, 100
>> kB/sec) it's likely that every read() operation will yield one byte of
>> data, making the CPU spin around this instead of doing something useful.
>
> (...) Second, if the CPU *did* have "something useful" to do (run other
> processes, or whatever), it would, causing a bit more time to pass between
> the read() and it might return more than one byte. It will only return one
> byte when there's nothing better to do than calling read() all the time.

That's an interesting point, but I'm not 100% sure on that one. Why
would the scheduler take the CPU away, if the read() operation always
returns only a byte or two? It would eventually take it away for a while, which
would let data stack up, but each process would get its fair slice.


I'm not saying this would cause some real degradation, but on a slow
embedded processor, seeing 10% CPU usage on a process which is supposed
to just read data...? The calculation is a 200 MHz processor, 100 kB/sec
and 200 CPU clocks spent on processing each byte (100,000 bytes/sec x 200
clocks = 20M clocks/sec, i.e. 10% of a 200 MHz CPU).



--
Web: http://www.billauer.co.il



[Haifux] Fwd: Job offer: Python developer

2011-04-26 Thread Zaar Hai
Good day everyone!
I figured that posting job offers just before Passover was not a very
good idea, so I'd like to repost this one. I hope it does not go against
the list rules.


-- Forwarded message --
From: Zaar Hai 
Date: Sat, Apr 16, 2011 at 10:06 PM
Subject: Job offer: Python developer
To: Haifa Linux Club 


Good evening everyone!
ForNova, the leader in web data aggregation, is looking for an
experienced Python developer to join the Deployment team (full-time
position). We are located in Yokneam and looking for a person who fits
the following job requirements:

* Fluency with Python
* Hands on with MySQL and Linux
* Experience in Web-related development
* Entrepreneurial spirit
* Good time management skills and the ability to work in a highly agile environment
Please submit your CV to jobs at fornova dot net. Please mark
[HAIFUX] in the subject.
Regards,
--
Zaar



-- 
Zaar


Re: [Haifux] Implementing read() like UNIX guys like it

2011-04-26 Thread Nadav Har'El
On Sat, Apr 23, 2011, Eli Billauer wrote about "Re: [Haifux] Implementing 
read() like UNIX guys like it":
> >if the user calls "read" and there is data - return what you have to the
> >user without blocking.
> That is one of the options I considered. The drawback of doing this 
> exactly like this, is that if data arrives at a slow rate (say, 100 
> kB/sec) it's likely that every read() operation will yield one byte of 
> data, making the CPU spin around this instead of doing something useful.

Two things make this a non-issue (or at least a not-very-important issue).

First, if the data arrives at a slow rate, it doesn't really matter how
optimized your code is - nobody will notice the difference anyway.

Second, if the CPU *did* have "something useful" to do (run other processes,
or whatever), it would, causing a bit more time to pass between the read()
and it might return more than one byte. It will only return one byte when
there's nothing better to do than calling read() all the time.

-- 
Nadav Har'El|  Tuesday, Apr 26 2011, 22 Nisan 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Open your arms to change, but don't let
http://nadav.harel.org.il   |go of your values.


Re: [Haifux] Implementing read() like UNIX guys like it

2011-04-26 Thread Nadav Har'El
On Fri, Apr 22, 2011, Eli Billauer wrote about "[Haifux] Implementing read() 
like UNIX guys like it":
> Now the dilemma: Suppose that the read() method was called with a requested 
> byte count of 512. The driver checks its internal buffer, and discovers 
> that it can supply 10 bytes right away and return, or it can block and wait 
> until it has all the 512, and return only when the request is completed 
> fully. The hardware data source is streaming, with no guarantee if and when 
> this data will arrive.
> 
> Or, it can try waiting a little (what is "a little"?), and then time out, 
> returning with whatever it has got (à la TCP/IP).
> 
> I suppose all three possibilities are legal. The question is what will work 
> most naturally.

From what I understand, your device resembles a pipe - a stream of bytes that
come from some external source, and you have no idea when they will come.

In that case, the most "natural" behavior is the behavior of regular Unix
pipes, i.e., when some data is available, a read(2) returns with possibly
a lower number of bytes read than requested. This basically means that your
options #1 (return what you have) and #3 ("wait a little") are recommended -
but option #1 seems the easiest to implement and I'd therefore recommend
against the more complex #3 unless you're sure this will give you some sort
of performance win (I don't know how much performance tweaking actually
matters in your use-case).
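
In kernel terms, option #1 is just the usual wait-then-copy pattern.
Roughly (an untested sketch - the mydev/dev->wq/dev->count names are
invented for illustration, not taken from your driver):

static ssize_t mydev_read(struct file *filp, char __user *buf,
                          size_t len, loff_t *ppos)
{
    struct mydev *dev = filp->private_data;
    size_t n;

    /* Sleep only while the buffer is empty... */
    if (wait_event_interruptible(dev->wq, dev->count > 0))
        return -ERESTARTSYS;

    /* ...then return whatever is there, even if it is less than len. */
    n = min_t(size_t, len, dev->count);
    if (copy_to_user(buf, dev->buffer, n))
        return -EFAULT;
    dev->count -= n;    /* locking against the interrupt handler omitted */
    return n;
}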

Option #2 (blocking until all requested bytes are available) is bad, and
un-Unix-like: It means that a program, e.g., cat(1), which uses stdio
(fgets, scanf, etc.) may issue a read(fd, buf, BUFSIZE) and then block
until BUFSIZE bytes are available. This may make sense for files on hard
disks, but this sort of buffering is unexpected on pipes, and probably on your
device as well.
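
To make the stdio point concrete: even a trivial reader like the one
below ends up issuing large read()s behind the scenes, because stdio
fills its buffer in big chunks ("/dev/mydev" is just a made-up device
path):

#include <stdio.h>

int main(void)
{
    char line[256];
    FILE *f = fopen("/dev/mydev", "r");   /* placeholder device node */

    if (!f) {
        perror("fopen");
        return 1;
    }

    /* fgets() goes through stdio, which calls read(fd, buf, BUFSIZ)
     * underneath.  With pipe-like (option #1) semantics each line is
     * printed as soon as it has arrived; with option #2 semantics the
     * program would hang until a full BUFSIZ-sized chunk shows up. */
    while (fgets(line, sizeof(line), f))
        printf("got: %s", line);

    fclose(f);
    return 0;
}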

> relevant file descriptor to behave as one would expect. For example, a user 
> may choose to read from the file descriptor with scanf or fgets. The user 
> would expect these to return whenever sufficient data has been fed into the 

This is why I recommended against option #2.

> hardware side. On the other hand, if the data comes slowly into hardware, 
> read()'s will return with one byte at a time, which is maybe not desirable 
> either.

If the device is so slow that reads return one byte at a time, the performance
impact of reading them one by one is probably not important. On a quick
device, by the time you next read() from it probably more than one character
will be ready. This will give you the best of both worlds: Low latency when
the device is slow, and high throughput when it is fast.

-- 
Nadav Har'El|  Tuesday, Apr 26 2011, 22 Nisan 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |"Outlook not so good." Wow! That magic 8-
http://nadav.harel.org.il   |ball knows everything! So, what about IE?