Linux-Development-Apps Digest #383, Volume #7     Mon, 7 May 01 15:13:13 EDT

Contents:
  Re: How to get a number of processors (John Hasler)
  split a file into multi-files ("Eric Chow")
  Re: How to get a number of processors (Stefaan A Eeckels)
  Sockets: mixing sync and async (Bruno Barberi Gnecco)
  Re: Handling large numbers of sockets? (Steve Connet)
  mysql & threads (Steve Connet)
  Re: mozilla browser ID (Toby Haynes)
  Re: Help - Edit route table in software? ("MR")
  Re: How to get a number of processors (John Beardmore)
  Re: How to get a number of processors (John Beardmore)
  Re: How to get a number of processors (Roberto Nibali)
  Re: How to get a number of processors (Nix)
  Re: How to get a number of processors (Greg Copeland)
  Re: How to get a number of processors (John Beardmore)

----------------------------------------------------------------------------

From: John Hasler <[EMAIL PROTECTED]>
Subject: Re: How to get a number of processors
Date: Mon, 7 May 2001 12:08:17 GMT

Floyd Davidson writes:
> Now try this command:

>  make -j3 MAKE="make -j3" bzImage

And for even more fun try

  make -j MAKE="make -j" bzImage

Be sure you have lots of RAM, though.
-- 
John Hasler
[EMAIL PROTECTED] (John Hasler)
Dancing Horse Hill
Elmwood, WI

------------------------------

From: "Eric Chow" <[EMAIL PROTECTED]>
Subject: split a file into multi-files
Date: Mon, 7 May 2001 21:31:52 +0800

Hello,

Could you please show me how to split a file into several files using a
shell script?

For example,

index.dat
========
001   12
002   25
003   08
004   12
005   25
006   08
007   02


content.dat
=========
001 aaaaaa AAAAA .....
001 bbbbb BBBBBB .....
001 cccccc CCCCCCC ....... some other contents
002 .... another content for 002 ....
002 ... 002 datas ...
003 ....... 003 .....
003 .. This is another 003 data ...
004 ..... 004 ....
005 ... 005 ...
006 .... 006 ....
007 ... 007 1 ...
007 ... 007 2 ...

result-12.dat
==========
aaaaaa AAAAA .....
bbbbb BBBBBB .....
cccccc CCCCCCC ....... some other contents
..... 004 ....

result-02.dat
==========
... 007 1 ...
... 007 2 ...


result-08.dat
==========
....... 003 .....
.. This is another 003 data ...


result-25.dat
==========
.... another content for 002 ....
... 002 datas ...
... 005 ...


As in the example above, there are two input files: "index.dat" and
"content.dat". Could you please show me how to write a shell script that
produces the other four files: "result-12.dat", "result-08.dat",
"result-02.dat" and "result-25.dat"?

In "index.dat", the first column is a Sort-Key and the second column is a
Group-Key. Each result file collects the content lines (with the Sort-Key
removed) of every Sort-Key that maps to that file's Group-Key.

Take "result-12.dat": Sort-Keys "001" and "004" share the Group-Key "12",
so "result-12.dat" contains all the lines of Sort-Key "001" followed by all
the lines of Sort-Key "004".

Since I am not very experienced with shell scripting, could you please show
me a simple example of how to do that?

Best regards,
Eric
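
One possible way to do the grouping described above, sketched in Python rather
than shell. The function names, the "result-" prefix handling, and the choice
to skip content lines whose Sort-Key is missing from the index are assumptions
made for illustration; the post does not say what should happen in that case.

```python
from collections import defaultdict


def split_by_group(index_lines, content_lines):
    """Return {group_key: [content text, ...]} for the index/content layout above."""
    # Map each Sort-Key (column 1 of index.dat) to its Group-Key (column 2).
    group_of = {}
    for line in index_lines:
        fields = line.split()
        if len(fields) >= 2:
            group_of[fields[0]] = fields[1]

    # Append each content line, minus its leading Sort-Key, to its group's list.
    groups = defaultdict(list)
    for line in content_lines:
        parts = line.rstrip("\n").split(None, 1)
        if not parts or parts[0] not in group_of:
            continue  # blank line or unknown Sort-Key: skipped (assumption)
        rest = parts[1] if len(parts) > 1 else ""
        groups[group_of[parts[0]]].append(rest)
    return dict(groups)


def write_groups(groups, prefix="result-", suffix=".dat"):
    """Write one file per Group-Key, for example result-12.dat."""
    for key, lines in groups.items():
        with open(f"{prefix}{key}{suffix}", "w") as out:
            out.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Small inline sample instead of real files, so the sketch runs anywhere.
    index = ["001   12", "002   25", "004   12"]
    content = ["001 aaaaaa AAAAA", "002 another content", "004 four"]
    for key, lines in split_by_group(index, content).items():
        print(key, lines)
```

With the real files, the two functions would be combined by opening
"index.dat" and "content.dat" and passing the open file objects to
split_by_group(), then handing the result to write_groups().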

------------------------------

From: [EMAIL PROTECTED] (Stefaan A Eeckels)
Subject: Re: How to get a number of processors
Crossposted-To: comp.os.linux.development.system
Date: Mon, 7 May 2001 15:19:55 +0200

In article <[EMAIL PROTECTED]>,
        Greg Copeland <[EMAIL PROTECTED]> writes:
> [EMAIL PROTECTED] (Eric P. McCoy) writes:
> 
>> Greg Copeland <[EMAIL PROTECTED]> writes:
>> 
>> > Oddly enough, my man page for sysconf doesn't show the _SC_NPROCESSORS_CONF
>> > option.  Hmm....wonder how long it's been around.  
>> 
>> It starts with an underscore, which probably means it's nonstandard.
>> I'd be a little stronger and say that if it's not in the man page,
>> you also shouldn't use it.
>> 
>> This strikes me as a battle of bad ideas: I hate writing a text parser
>> to deal with /proc; I don't like using nonstandard pieces of code; and
>> no program should ever need to know how many processors are in a given
>> box.  There are cases where you'd want to use one or all of these bad
>> ideas, but I, for one, would need a pressing reason.
>> 
> 
> I strongly disagree with the assertion that no one would ever need to know
> how many processors there are in a system.  I have worked on some large UNIX
> boxes and needed to know how many there were.  I took the short path and made
> it a user-determined value, but making it dynamic would have been nicer.  BTW,
> it was for parallelizing some computations and limiting the number of parallel
> index optimizations that could run at one time while still leaving CPUs free
> to handle other batch jobs.  In short, if I used more than 75% of the CPUs
> (rounded down), the batch jobs suffered too heavily.  As you can see, making
> it dynamic would have been nice since it had to run on several different size
> boxes (4 and 8 CPU machines).  Keep in mind, some of these batch jobs would
> run for 8-18 hours.  If I took the other CPUs, it would force them to run
> over their maximum run-time window, which was 18 hours.  If I didn't take
> enough, I wouldn't finish in the required window.  I would rather have had
> someone be able to say 50% or 75% of the CPUs should go here than a hard
> number, as I did, which required unique config files on each system.  At any
> rate, I don't think it's wise to assert that such a rule should exist.

Maybe the OS should provide a service to ensure that certain
processes get a minimum of CPU time. The approach you describe
is a hack, and goes completely against the basic Unix concept
of hiding the hardware differences and specifics from the 
applications. 

-- 
Stefaan
-- 
How's it supposed to get the respect of management if you've got just
one guy working on the project?  It's much more impressive to have a
battery of programmers slaving away. -- Jeffrey Hobbs (comp.lang.tcl)

------------------------------

From: Bruno Barberi Gnecco <[EMAIL PROTECTED]>
Subject: Sockets: mixing sync and async
Date: Mon, 07 May 2001 10:30:56 -0300
Reply-To: [EMAIL PROTECTED]

        I'm working on an application that uses sockets for communications.
It has to support asynchronous communication (some packets may arrive at
any time) but there's a small part, used for synchronization, where it 
has to be synchronous.
        I thought of two solutions:
* using signals to handle async. When it comes to the sync part, disable
them, and then reenable them.
* using a thread+select for the async part, and a separate socket for the
sync part.

        Problems: in the first, async packets could arrive at the beginning
of the sync function and get mixed up with the sync traffic; and signals are
likely to cause headaches. The second one has the usual thread problems.
        So, any suggestions?
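
A minimal sketch of the second option (a thread plus select() for the
asynchronous traffic, and a separate socket for the synchronous exchange),
written in Python. Everything here is illustrative: socketpair() stands in for
real network connections, and the names async_reader, sync_exchange and the
"ack:" reply format are hypothetical.

```python
import select
import socket
import threading


def async_reader(socks, received):
    """Thread body: collect packets from the async sockets until every peer closes."""
    socks = list(socks)
    while socks:
        readable, _, _ = select.select(socks, [], [])
        for s in readable:
            data = s.recv(4096)
            if data:
                received.append(data)
            else:
                socks.remove(s)  # empty read means the peer closed this socket


def sync_exchange(sock, request):
    """Blocking request/reply on the socket reserved for synchronization."""
    sock.sendall(request)
    return sock.recv(4096)


def demo():
    # socketpair() stands in for real network connections (assumption).
    async_local, async_peer = socket.socketpair()
    sync_local, sync_peer = socket.socketpair()

    received = []
    reader = threading.Thread(target=async_reader, args=([async_local], received))
    reader.start()

    def answer_sync():
        # Hypothetical remote side: acknowledge one synchronization request.
        sync_peer.sendall(b"ack:" + sync_peer.recv(4096))

    responder = threading.Thread(target=answer_sync)
    responder.start()

    async_peer.sendall(b"event-1")               # unsolicited packet, any time
    reply = sync_exchange(sync_local, b"sync")   # unaffected by async traffic
    responder.join()

    async_peer.close()                           # reader sees EOF and exits
    reader.join()
    for s in (async_local, sync_local, sync_peer):
        s.close()
    return received, reply


if __name__ == "__main__":
    print(demo())
```

Because the synchronous exchange has its own socket, an asynchronous packet
arriving in the middle of it cannot be mistaken for the reply; the cost is the
usual care needed when the reader thread shares data with the rest of the
program.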

-- 
Bruno Barberi Gnecco <[EMAIL PROTECTED]>
http://www.geocities.com/RodeoDrive/1980/
Quoth the Raven, "Nevermore". - Poe

------------------------------

Subject: Re: Handling large numbers of sockets?
From: Steve Connet <[EMAIL PROTECTED]>
Date: Mon, 07 May 2001 15:05:46 GMT

[EMAIL PROTECTED] (Kaz Kylheku) writes:

> It's not hard to add or remove sockets; to nuke something from the
> poll array, just move the last element to that position.  To add a
> socket, add it beyond the last element.  In some programs, it's
> useful for some application data structure to have a back pointer to
> the struct pollfd; when you move it around this way within the
> array, you have to be sure to find and update that pointer.

I am interested in using poll() instead of select() as well. My
question is, when I disconnect a client, do I have to remove the
associated poll struct and reindex the poll struct array? Or is there
a quick fix? I heard something about setting the fd in the struct to 0
to have poll() ignore it? If so, then I can reindex the poll struct
array when the server is idle. Can you confirm this? How do you deal
with disconnected clients as far as the poll struct array goes?
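
For what it is worth, setting fd to 0 would not work, since 0 is a valid
descriptor (standard input); poll() skips entries whose fd is negative. Below
is a small Python model of the bookkeeping Kaz describes, plus the "mark now,
compact when idle" variant. It is only an illustration of the array handling:
the PollTable class and its method names are hypothetical, and a Python list
stands in for the C array of struct pollfd.

```python
POLLIN = 0x001  # illustrative event mask; the value is not used by the logic


class PollTable:
    """List-backed stand-in for a C `struct pollfd` array (illustration only)."""

    def __init__(self):
        self.entries = []    # each entry is [fd, events], like one struct pollfd
        self.index_of = {}   # fd -> position in entries (the "back pointer")

    def add(self, fd, events=POLLIN):
        # New sockets go just beyond the last element.
        self.index_of[fd] = len(self.entries)
        self.entries.append([fd, events])

    def remove(self, fd):
        # Move the last element into the vacated slot and fix its back pointer.
        pos = self.index_of.pop(fd)
        last = self.entries.pop()
        if pos < len(self.entries):
            self.entries[pos] = last
            self.index_of[last[0]] = pos

    def disable(self, fd):
        # Quick fix: a negative fd makes poll() ignore the slot.
        pos = self.index_of.pop(fd)
        self.entries[pos][0] = -1

    def compact(self):
        # Later, when the server is idle, drop the ignored slots and reindex.
        self.entries = [e for e in self.entries if e[0] >= 0]
        self.index_of = {e[0]: i for i, e in enumerate(self.entries)}


if __name__ == "__main__":
    table = PollTable()
    for fd in (3, 4, 5, 6):
        table.add(fd)
    table.remove(4)
    print(table.entries, table.index_of)
```

Both approaches avoid shifting the whole array on every disconnect; remove()
keeps the array dense at all times, while disable() defers the cleanup.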

-- 
Steve Connet            Remove USENET to reply via email
[EMAIL PROTECTED]

------------------------------

Subject: mysql & threads
From: Steve Connet <[EMAIL PROTECTED]>
Date: Mon, 07 May 2001 15:07:14 GMT

In my C program, can I have multiple threads performing mysql_query calls
on the same connection? In other words, is it safe to do queries
concurrently on the same connection? Or should each thread have its
own connection? I couldn't find this info in the mysql manual or by
doing a web search, so I'm hoping someone on usenet will know.

-- 
Steve Connet            Remove USENET to reply via email
[EMAIL PROTECTED]

------------------------------

From: Toby Haynes <[EMAIL PROTECTED]>
Subject: Re: mozilla browser ID
Date: 07 May 2001 11:24:36 -0400

On 4 May 2001, [EMAIL PROTECTED] wrote:

> Toby Haynes ([EMAIL PROTECTED]) wrote:
>: On 2 May 2001, [EMAIL PROTECTED] wrote:
> 
>: > 
>: > Does anyone know where in the mozilla source is the part that IDs the
>: > browser when a web page asks.  I would like to change mine to windows and
>: > to ns 4.01.  Does anyone know where in that labyrinth of code this is?  I
>: > tried to grep it and I came up blank.  I want to do this b/c I don't want
>: > to be protected from myself when I try to run certain java applets.
> 
>: You don't need to change the source - just add a line to your user
>: preferences. There is an awful lot of black magic available in the user
>: prefs that aren't available through the GUI yet.
> 
>: // Override the default user-agent string:
>: user_pref("general.useragent.override", "Mozilla/5.0 (X11; U; Linux
>: 2.2.16-22smp i686; en-US; m18) Gecko/20010110 Netscape6/6.5");
> 
> Thanks, now we are getting somewhere.  I would like to change my OS to
> windows, though, I'm changing the Netscape6/6.5 to Netscape4.51.
> 
> Anywhere I need to specify win95?

Hmmm - I think the normal Netscape 4.75 ID looks like this

user_pref("general.useragent.override", "Mozilla/4.75 [en] (Win95; U)"); 

Munge as you see fit :-)

Cheers,
Toby Haynes


-- 

Toby Haynes
The views and opinions expressed in this message are my own, and do
not necessarily reflect those of IBM Canada.

------------------------------

From: "MR" <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.development.system,comp.os.linux.networking
Subject: Re: Help - Edit route table in software?
Date: Wed, 2 May 2001 19:12:10 +0200


"Sebastien Jean" <[EMAIL PROTECTED]> wrote in message
news:9c56b2$7ac$[EMAIL PROTECTED]...
> Changing the routing table from the command line is easy.  Does anyone know
> how to change the routing table from within a program written in C/C++?
> What about changing Ethernet device parameters?
>
Try the libiptc library shipped with iptables 1.2.1.

Bye,
 MR



------------------------------

From: John Beardmore <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.development.system
Subject: Re: How to get a number of processors
Date: Mon, 7 May 2001 16:45:10 +0100

In message <[EMAIL PROTECTED]>, Stefaan A Eeckels 
<[EMAIL PROTECTED]> writes
>In article <[EMAIL PROTECTED]>,
>       [EMAIL PROTECTED] (Dave Blake) writes:
>> Eric P. McCoy <[EMAIL PROTECTED]> wrote:
>>
>>> This strikes me as a battle of bad ideas: I hate writing a
>>> text parser to deal with /proc; I don't like using nonstandard
>>> pieces of code; and no program should ever need to know how
>>> many processors are in a given box.  There are cases where
>>> you'd want to use one or all of these bad ideas, but I, for
>>> one, would need a pressing reason.
>>
>> Suppose I am writing a data crunching piece of software that
>> parallelizes easily, and wish to run a thread on each processor.
>>
>> I first parse /proc/stat, and then crunch away with a thread
>> on each CPU.
>>
>> For a web searching program, you may wish to know the number of
>> NICs and CPUs, and take the lower of the two as the number of
>> threads to run. And so on.
>
>In a Unix system, the application should not need to know
>anything about the hardware details.

<dOGMA aLERT !>

Well then, in that case, why don't YOU start the project to make gcc 
exploit all possible opportunities for parallelism ?


> The recent obsession
>with threads violates that basic tenet.

You sound as if you find threads morally objectionable as opposed to 'just 
another way to get the job done'.


> If one wants to
>squeeze the last ounce of performance from a box,

But it's not the 'last ounce' !  On some boxes it's most of the ounces !


> don't
>use an OS.

Oh balls.

     'Use an OS and a compiler that knows about parallelism'

might be a better assertion, but pausing briefly to live in the real 
world, C is not terribly 'parallel aware', and gcc is what most of us here 
want to work with.

I don't see any moral problem with C and Linux supporting threads.  Even 
on single CPU machines this can speed up some IO operations with 
simultaneous reads on more than one device for example.  I really don't 
see the point in coming over all 'closed minded' about it.

Now if you're going to make the number of threads equal the number of 
processors for some crunching task, WTF is wrong with the OS making that 
info available ??  It's only one integer and nobody's forcing YOU to use 
it if you object on religious grounds.
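
As a rough illustration of the "one thread per processor" idea from Dave
Blake's quoted text, here is a Python sketch that counts the per-CPU lines in
/proc/stat and sizes a worker pool accordingly. The function names and the
fallback to os.cpu_count() are assumptions; note also that in CPython a
CPU-bound job would normally use processes rather than threads, so the pool
below only shows the sizing logic.

```python
import os
import re
from concurrent.futures import ThreadPoolExecutor


def count_cpus_in_proc_stat(text):
    """Count per-CPU lines (cpu0, cpu1, ...) in the contents of /proc/stat."""
    return sum(1 for line in text.splitlines() if re.match(r"cpu\d+\s", line))


def worker_count(path="/proc/stat"):
    """Number of workers to start: one per CPU listed, never fewer than one."""
    try:
        with open(path) as f:
            n = count_cpus_in_proc_stat(f.read())
    except OSError:
        n = 0  # no /proc here (or unreadable): fall back below
    return n or os.cpu_count() or 1


if __name__ == "__main__":
    workers = worker_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Placeholder workload: square a few numbers, one task per item.
        results = list(pool.map(lambda x: x * x, range(8)))
    print(workers, results)
```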


Cheers, J/.
-- 
John Beardmore

------------------------------

From: John Beardmore <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.development.system
Subject: Re: How to get a number of processors
Date: Mon, 7 May 2001 16:47:19 +0100

In message <[EMAIL PROTECTED]>, Stefaan A Eeckels 
<[EMAIL PROTECTED]> writes

>Maybe the OS should provide a service to ensure that certain
>processes get a minimum of CPU time. The approach you describe
>is a hack, and goes completely against the basic Unix concept
>of hiding the hardware differences and specifics from the
>applications.

Guaranteeing resources to processes may be a good tool, but it isn't an 
alternative to knowing how many threads to create for a big crunching 
job.  For that, you still need to know the number of available 
processors.


Cheers, J/.
-- 
John Beardmore

------------------------------

From: Roberto Nibali <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.development.system
Subject: Re: How to get a number of processors
Date: Mon, 07 May 2001 18:03:34 +0200

> I *think* this is new in glibc2.2.  If you don't have this sysconf, you can do
> the C equivalent of:
> 
>     grep processor < /proc/cpuinfo | wc -l
> 
> :)

<completely OT>

On behalf of Randal L. Schwartz:

UUOC: grep -c processor /proc/cpuinfo
Just another Useless Use of Usenet,

Roberto Nibali, ratz

</completely OT>
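
For comparison, a Python sketch of the same two approaches from the thread:
ask sysconf first, and fall back to counting "processor" lines in
/proc/cpuinfo. The function names are hypothetical, and the line match is
anchored at the start of the line, which is slightly stricter than the grep
shown above.

```python
import os


def count_processors_in_cpuinfo(text):
    """Roughly `grep -c '^processor'` applied to the given /proc/cpuinfo text."""
    return sum(1 for line in text.splitlines() if line.startswith("processor"))


def online_processors(name="SC_NPROCESSORS_ONLN"):
    """Return the processor count from sysconf, else from /proc/cpuinfo, else 1."""
    try:
        n = os.sysconf(name)
        if n > 0:
            return n
    except (AttributeError, ValueError, OSError):
        pass  # sysconf missing, or this name is unknown on the platform

    try:
        with open("/proc/cpuinfo") as f:
            n = count_processors_in_cpuinfo(f.read())
    except OSError:
        n = 0
    return n or 1


if __name__ == "__main__":
    print(online_processors())
```

Passing "SC_NPROCESSORS_CONF" instead asks for the configured rather than the
currently online processors, which is the value mentioned earlier in the
thread.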

-- 
mailto: `echo [EMAIL PROTECTED] | sed 's/[NOSPAM]//g'`

------------------------------

From: Nix <$}xinix{$@esperi.demon.co.uk>
Crossposted-To: comp.os.linux.development.system
Subject: Re: How to get a number of processors
Date: 07 May 2001 17:18:56 +0100

On Mon, 7 May 2001, John Beardmore yowled:
> Guaranteeing resources to processes may be a good tool, but it isn't
> an alternative to knowing how many threads to create for a big
> crunching job.  For that, you still need to know the number of
> available processors.

<pedant>
No, you need to know the number of simultaneously executing tasks, which
is quite a different thing (or it could be, on high-end systems).

The number of physical lumps of doped silicon inside the machine is an
irrelevant hardware detail.
</pedant>

-- 
`I like to think the situation could be likened to doing heart surgery
 in a hurry with a plastic spoon after having a couple of pints.'
                                                       --- James Reeves

------------------------------

Crossposted-To: comp.os.linux.development.system
Subject: Re: How to get a number of processors
From: Greg Copeland <[EMAIL PROTECTED]>
Date: 07 May 2001 12:37:38 -0500

[EMAIL PROTECTED] (Stefaan A Eeckels) writes:

> In article <[EMAIL PROTECTED]>,
>       Greg Copeland <[EMAIL PROTECTED]> writes:
> > 
> > I strongly disagree with the assertion that no one would ever need to know
> > how many processors there are in a system.  I have worked on some large UNIX
> 
> Maybe the OS should provide a service to ensure that certain
> processes get a minimum of CPU time. The approach you describe
> is a hack, and goes completely against the basic Unix concept
> of hiding the hardware differences and specifics from the 
> applications. 

I *completely* disagree with you.  You are unable to get past the simple fact that
parallel computing has requirements above and beyond simple server programming
where you are using a pool of processes/threads or even a 1:1 process/thread per
client.  Actually, let's talk about those for a second.  If you are no longer
using a 1:1 model of process/thread per client, then it's safe to assume you've
come to a scalability impasse where you decided that a pool of resources will
scale better.  Why do you suppose that you hit the wall and were forced to change
models?  In the above cases, you are assuming that yours is the only process with
significant priority on the system, as obviously, a heavily loaded system will, by
far, not be servicing clients and other applications fairly.  Once you realize that
you may need to service a mix of these types of applications, you are once again
forced to adopt another model.  This is hardly a hack.  This is the real world.
Now then, *my* implementation is somewhat of a hack, but simply because the
information was not readily available.  Keep in mind, this is one such reason
why some OSes provide facilities for process affinity, which would simply allow
a process to run, for example, on the first 4 processors and leave the other
four free regardless of how many children or threads the parent makes.
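
A sketch of how the "75% of the CPUs, rounded down" rule and process affinity
could be combined, in Python on Linux. The helper names are hypothetical, and
choosing the lowest-numbered CPUs is an assumption made to mirror the "first 4
processors" example; os.sched_getaffinity and os.sched_setaffinity exist only
on platforms that support them.

```python
import math
import os


def cpus_to_use(available, fraction=0.75):
    """Pick floor(fraction * N) CPUs (at least one), lowest-numbered first."""
    ids = sorted(available)
    keep = max(1, math.floor(len(ids) * fraction))
    return set(ids[:keep])


def restrict_to_fraction(fraction=0.75):
    """Linux only: pin this process, and children started later, to a CPU subset."""
    allowed = os.sched_getaffinity(0)   # CPUs this process may currently use
    chosen = cpus_to_use(allowed, fraction)
    os.sched_setaffinity(0, chosen)     # children inherit the restricted mask
    return chosen


if __name__ == "__main__":
    # Only report what would be chosen; nothing is pinned by this demo.
    if hasattr(os, "sched_getaffinity"):
        print(cpus_to_use(os.sched_getaffinity(0)))
    else:
        print(cpus_to_use(range(os.cpu_count() or 1)))
```

Expressing the limit as a fraction means the same configuration can be used on
both the 4-CPU and the 8-CPU machines, instead of a hard number per system.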

I spelled out, using a real-world situation, why such a mechanism needs to exist.
You simply said it's a hack and violates some made up "tenet".  Please tell me
how you would solve it.  Keep in mind this was on a project that was three years
and 2-million overdue and long delays of yet *another* complex system would
more than likely result in the rolling of your head and/or arse.  Even without
such constraints, I'd like to see your magical solution.  Saying the concept
is a hack, which is clearly required, without any supportive evidence seems
a pretty cheap way out to me.

Greg


-- 
Greg Copeland, Principal Consultant
Copeland Computer Consulting
==================================================
PGP/GPG Key at http://www.keyserver.net
DE5E 6F1D 0B51 6758 A5D7  7DFE D785 A386 BD11 4FCD
==================================================

------------------------------

From: John Beardmore <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.development.system
Subject: Re: How to get a number of processors
Date: Mon, 7 May 2001 19:31:44 +0100

In message <[EMAIL PROTECTED]>, Nix 
<$}xinix{$@esperi.demon.co.uk> writes
>On Mon, 7 May 2001, John Beardmore yowled:

>> Guaranteeing resources to processes may be a good tool, but it isn't
>> an alternative to knowing how many threads to create for a big
>> crunching job.  For that, you still need to know the number of
>> available processors.
>
><pedant>
>No, you need to know the number of simultaneously executing tasks, which
>is quite a different thing (or it could be, on high-end systems).

Well what you really want to do is match the number of high priority 
threads to processors.


>The number of physical lumps of doped silicon inside the machine is an
>irrelevant hardware detail.

Nonsense.


Cheers, J/.
-- 
John Beardmore

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list by posting to the
comp.os.linux.development.apps newsgroup.

Linux may be obtained via one of these FTP sites:
    ftp.funet.fi                                pub/Linux
    tsx-11.mit.edu                              pub/linux
    sunsite.unc.edu                             pub/Linux

End of Linux-Development-Apps Digest
******************************
