git session at BAFUG tomorrow (San Francisco Bay Area FreeBSD User Group).

2013-10-09 Thread Alfred Perlstein
I've been doing a few one-on-one sessions at iXsystems explaining the 
git model to developers and have had much success.


Tomorrow at BAFUG 
(http://www.meetup.com/BAFUG-Bay-Area-FreeBSD-User-Group/events/144351492/) 
I will be doing a quick talk on git and then doing a breakaway session 
on managing large projects using git.



This week Alfred Perlstein will give a git talk for FreeBSD users, and
offer a 1-2 hour hands-on demo in a breakaway session for people
interested in using git to manage a large project.


We will cover migrating your FreeBSD customizations from one version
of FreeBSD to another. Specifically, Alfred will do this migration
hands-on, using rebasing.


To participate, all you need to do is run git clone
https://github.com/freebsd/freebsd beforehand (this takes about 30
minutes).


In addition, a computer with wi-fi or some other internet connection to
github.com is needed.


As a prerequisite, it is expected that you have some understanding of 
patch(1) and diff(1).


There will be no recording allowed and the format will allow only for 
short discussion and no derailing. Discussion of other SCM tools is 
only appropriate in the context of understanding what we are doing in 
git. Advocacy of other SCM tools will not be tolerated.  If you cannot
abide by these rules you will be asked to leave the session.





The workshop will mostly focus on git rebase of a large project;
however, we may cover a few other areas depending on how much progress we
make in the 1-2 hours we set aside.


Again, if you want to participate, show up on time and make sure you 
have a git clone of freebsd already made!


-Alfred

--
Alfred Perlstein

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: [SPAM] Announcing: nuOS 0.0.9.1b1 - a whole NEW FreeBSD distro, NOT a fork

2013-07-07 Thread Alfred Perlstein

On 7/7/13 3:05 PM, C. Bergström wrote:

On 07/ 8/13 04:58 AM, Chad J. Milios wrote:

snip


Outline of features:

Extends plain old FreeBSD 9.1 (RELEASE or STABLE) and maintains total 
compatibility

We seek to remain nimble
Expect a production-ready seal of approval to lag behind releases 
by no more than a week or two

and prebuilt images and packages
e.g. releases like 9.2 and 10.0, et al
Someone should be able to build it and use all applicable 
features on 8.4 with ease

we simply haven't the time or inclination to even try
Default full ZFS filesystem layout, completely legacy-free
Boot from ZFS, boot to ZFS
If you'd like, use all 100.0% of your drives for one large
zpool

Use one large zpool for all of your
filesystems
block volumes
alternate boot environments, including one called 
rescue which is included

NO partitions, not some tiny /, not even a /boot
Just ZFS datasets in their infinite flexibility
/etc is now a ZFS dataset of its own
How did we do it?
Decades of conventional wisdom says /etc must be 
on /.

Check it out, discuss the whys and the trade-offs.
nu_jail - provision all sorts of jails
No guesswork
Yet no cookie-cutter limitations
Clean-room jails provisioned almost instantly
ZFS clone of /etc and /var give you almost no storage overhead
nullfs and/or unionfs mounts of /, /usr, /usr/local give you 
almost no memory overhead

Run 1,000 jails and 10,000 Apache instances
they safely access the same executable memory pages
they securely know not of one-another's existence
Advanced intra-host networking with VIMAGE kernel by default, 
simplified
Made for developers who want robustness, power and flexibility 
streamlined for
Unlimited development, testing, staging and production 
environments

Uses all of the new jail and vnet features of FreeBSD 9.1
We cleaned out all of the cruft left over from earlier versions

trolling side comment
omg you've created Solaris
/trolling side comment

If you're going to spam commercial stuff with absolutely no 
technically interesting details - please keep it brief at the least.


Generally people will be curious about
What are you actually adding to the ISO which FBSD-current can't do? 
If it's not upstream already - will it be contributed upstream?


It seems pretty obvious to me that the contribution is that all this 
stuff works out of the box.  That is pretty nice.





--
Alfred Perlstein
VP Software Engineering, iXsystems



Re: please review, patch for lost camisr

2013-05-29 Thread Alfred Perlstein

On 5/29/13 12:16 AM, Konstantin Belousov wrote:

On Tue, May 28, 2013 at 10:31:40PM -0700, Alfred Perlstein wrote:

On 5/28/13 10:08 PM, Konstantin Belousov wrote:

On Tue, May 28, 2013 at 01:35:01PM -0700, Alfred Perlstein wrote:

[[  moved to hackers@ from private mail. ]]

On 5/28/13 1:13 PM, John Baldwin wrote:

On Tuesday, May 28, 2013 3:29:41 pm Alfred Perlstein wrote:

On 5/28/13 9:04 AM, John Baldwin wrote:

On Tuesday, May 28, 2013 2:13:32 am Alfred Perlstein wrote:

Hey folks,

I had a talk with Nathan Whitehorn about the camisr issue.  The issue we
are seeing is mostly known, but to summarize, we are losing the camisr
signal and the camisr is not being run.

I gave him a summary of what we have been seeing and pointed him to the
code I am concerned about here:
http://pastebin.com/tLKr7mCV  (this is inside of kern_intr.c).

What I think is happening is that the setting of it_need to 0
inside of sys/kern/kern_intr.c:ithread_loop() is not being scheduled
correctly and it is being delayed until AFTER the call to
ithread_execute_handlers() right below the atomic_store_rel_int().

This seems highly unlikely, to the extent that if this were true all our
locking primitives would be broken.  The store_rel is actually a release
barrier which acts more like a read/write fence.  No memory accesses (read or
write) from before the release can be scheduled after the associated store,
either by the compiler or CPU.  That is what Konstantin is referring to in his
commit when he says release semantics.

Yes, that makes sense, however does it specify that the writes *must*
occur at that *point*?  If it only enforces ordering then we may have
some issue, specifically because the setting of it to '1' inside of
intr_event_schedule_thread has no barrier other than the acq semantics
of the thread lock.  I am wondering what is forcing out the '1' there.

Nothing ever forces writes.  You would have to invalidate the cache to do that
and that is horribly expensive.  It is always only about ordering and knowing
that if you can complete another operation on the same cookie variable with
acquire semantics that earlier writes will be visible.

By cookie, you mean a specific memory address, basically a lock? This is
starting to reinforce my suspicions as the setting of it_need is done
with release semantics, however the acq on the other CPU is done on the
thread lock.  Maybe that is irrelevant.  We will find out shortly.


See below as I think we have proof that this is somehow happening.

Having ih_need of 1 and it_need of 0 is certainly busted.  The simplest fix
is probably to stop using atomics on it_need and just grab the thread lock
in the main ithread loop and hold it while checking and clearing it_need.


OK, we have some code that will either prove this, or perturb the memory
ordering enough to make the bug go away, or prove this assertion wrong.

We will update on our findings hopefully in the next few days.

IMO the read of it_need in the 'while (ithd->it_need)' should
have acquire semantics, otherwise the future reads in the
ithread_execute_handlers(), in particular, of the ih_need, could pass
the read of it_need and cause the situation you reported.  I do not
see any acquire barrier between a condition in the while() statement
and the read of ih_need in the execute_handlers().

It is probably true that the issue you see was caused by r236456, in the
sense that the implicitly locked xchgl instruction on x86 has full barrier
semantics.  As a result, the store_rel() was actually an acquire too, making
this reordering impossible.  I argue that this is not a bug in r236456,
but an issue in kern_intr.c.

If I remember the code correctly that would probably explain why we see
it only on 9.1 systems.

On the other hand, John's suggestion to move the manipulations of
it_need under the lock is probably the best anyway.


I was wondering if it would be lower latency to maintain it_need,
however to keep another variable it_needlocked under the thread lock.
This would result in potential superfluous interrupts, however under
load you would allow the ithread to loop without taking the thread lock
some number of times.

I am not really sure if this is really worth the optimization
(especially since it can result in superfluous interrupts) however it
may reduce latency and that might be important.

Are there some people that I can pass the patch on to for help with
performance once we confirm that this is the actual bug?   We can do
internal testing, but I am worried about regressing performance of any
form of IO for the kernel.

I'll show the patch soon.

Thank you for the information.  This is promising.

Well, if you and I are right, the minimal patch should be

diff --git a/sys/kern/kern_intr.c b/sys/kern/kern_intr.c
index 8d63c9b..7c21015 100644
--- a/sys/kern/kern_intr.c
+++ b/sys/kern/kern_intr.c
@@ -1349,7 +1349,7 @@ ithread_loop(void *arg)
 * we are running, it will set it_need to note that we

Re: FreeBSD installers and future direction

2013-05-28 Thread Alfred Perlstein

On 5/28/13 7:49 AM, Nathan Whitehorn wrote:

On 05/27/13 23:36, Alfred Perlstein wrote:

On 5/27/13 6:53 PM, Nathan Whitehorn wrote:

On 05/27/13 20:40, Alfred Perlstein wrote:

On 5/27/13 2:23 PM, Bruce Cran wrote:

On 27/05/2013 21:28, Alfred Perlstein wrote:

On 5/27/13 11:40 AM, Bruce Cran wrote:

Yes.

Is this a joke?


It probably /was/ too short a reply. Personally I think there 
should be a single UI and scripting interface across all 
platforms. We should try and get pc-sysinstall running on all of 
them first in case there's some problem that means it can't be 
done, in which case we'd need to use a different backend.




There are just going to be certain platforms that make it EASY to 
do cool things.  We should embrace that!  That's why there are 
different platforms!


Some are great for low power, others are great for graphics, cpu 
power, gpu, networking etc.


If we always go for the lowest common denominator then we are 
crippling all the platforms for no one's benefit.  Even if 
something CAN be done, if it is very difficult, or just never 
happening, then we can't limit everyone's experience based on the 
more difficult and/or resource strapped platforms.


It's just not good business.


Yes, and all of this cuts both ways: pc-sysinstall has no wireless 
setup support, for instance. Right now we support what we support 
because it is the most feature-complete thing we have, not just on 
tier-2 platforms but also on x86.


To bring this discussion back to the ground, the fact is that we 
lack an installer that has both internal support for ZFS and a UI. 
One of the reasons for this is that making a good expressive UI for 
ZFS is a non-trivial undertaking given its enormous flexibility. The 
bsdinstall partition editor has been written to be extensible for 
this, and several people have started writing code to do it, but no 
one ended up having time to finish. Probably a reasonable thing to 
do is to start with supporting only a minimal set of features. If 
anyone felt like actually writing this code, I'm sure it would be 
appreciated by all and be more productive than email exchanges.

-Nathan


I'm sure if there was a list of reasonable things, such as wireless,
then pc-sysinstall could be augmented.  This is the first I've heard
of that.  All the other complaints have been based on portability.


Is that all that is required now, wireless?


There are more, as well. A partial list of missing features on both 
sides is here: https://wiki.freebsd.org/PCBSDInstallMerge. Other major 
ones are IPv6 (maybe this has changed?) and no jail setup feature. 
Most of the existing disk partitioning code in pc-sysinstall, which is 
the only thing in a FreeBSD installer that is at all complicated, is 
also *extremely* fragile and needs in all likelihood to be entirely 
replaced. The merge effort stalled because of this kind of issue -- 
doing a merge rapidly became rewriting both from scratch.

-Nathan


Ah this is so cool.  I'll bring it up with the PCBSD folks today.

Thank you Nathan.

-Alfred


Re: please review, patch for lost camisr

2013-05-28 Thread Alfred Perlstein

[[  moved to hackers@ from private mail. ]]

On 5/28/13 1:13 PM, John Baldwin wrote:

On Tuesday, May 28, 2013 3:29:41 pm Alfred Perlstein wrote:

On 5/28/13 9:04 AM, John Baldwin wrote:

On Tuesday, May 28, 2013 2:13:32 am Alfred Perlstein wrote:

Hey folks,

I had a talk with Nathan Whitehorn about the camisr issue.  The issue we
are seeing is mostly known, but to summarize, we are losing the camisr
signal and the camisr is not being run.

I gave him a summary of what we have been seeing and pointed him to the
code I am concerned about here:
http://pastebin.com/tLKr7mCV  (this is inside of kern_intr.c).

What I think is happening is that the setting of it_need to 0
inside of sys/kern/kern_intr.c:ithread_loop() is not being scheduled
correctly and it is being delayed until AFTER the call to
ithread_execute_handlers() right below the atomic_store_rel_int().

This seems highly unlikely, to the extent that if this were true all our
locking primitives would be broken.  The store_rel is actually a release
barrier which acts more like a read/write fence.  No memory accesses (read or
write) from before the release can be scheduled after the associated store,
either by the compiler or CPU.  That is what Konstantin is referring to in his
commit when he says release semantics.

Yes, that makes sense, however does it specify that the writes *must*
occur at that *point*?  If it only enforces ordering then we may have
some issue, specifically because the setting of it to '1' inside of
intr_event_schedule_thread has no barrier other than the acq semantics
of the thread lock.  I am wondering what is forcing out the '1' there.

Nothing ever forces writes.  You would have to invalidate the cache to do that
and that is horribly expensive.  It is always only about ordering and knowing
that if you can complete another operation on the same cookie variable with
acquire semantics that earlier writes will be visible.


By cookie, you mean a specific memory address, basically a lock? This is 
starting to reinforce my suspicions as the setting of it_need is done 
with release semantics, however the acq on the other CPU is done on the 
thread lock.  Maybe that is irrelevant.  We will find out shortly.





See below as I think we have proof that this is somehow happening.

Having ih_need of 1 and it_need of 0 is certainly busted.  The simplest fix
is probably to stop using atomics on it_need and just grab the thread lock
in the main ithread loop and hold it while checking and clearing it_need.



OK, we have some code that will either prove this, or perturb the memory 
ordering enough to make the bug go away, or prove this assertion wrong.


We will update on our findings hopefully in the next few days.

Thank you for your advice.

-Alfred







Re: please review, patch for lost camisr

2013-05-28 Thread Alfred Perlstein

On 5/28/13 10:08 PM, Konstantin Belousov wrote:

On Tue, May 28, 2013 at 01:35:01PM -0700, Alfred Perlstein wrote:

[[  moved to hackers@ from private mail. ]]

On 5/28/13 1:13 PM, John Baldwin wrote:

On Tuesday, May 28, 2013 3:29:41 pm Alfred Perlstein wrote:

On 5/28/13 9:04 AM, John Baldwin wrote:

On Tuesday, May 28, 2013 2:13:32 am Alfred Perlstein wrote:

Hey folks,

I had a talk with Nathan Whitehorn about the camisr issue.  The issue we
are seeing is mostly known, but to summarize, we are losing the camisr
signal and the camisr is not being run.

I gave him a summary of what we have been seeing and pointed him to the
code I am concerned about here:
http://pastebin.com/tLKr7mCV  (this is inside of kern_intr.c).

What I think is happening is that the setting of it_need to 0
inside of sys/kern/kern_intr.c:ithread_loop() is not being scheduled
correctly and it is being delayed until AFTER the call to
ithread_execute_handlers() right below the atomic_store_rel_int().

This seems highly unlikely, to the extent that if this were true all our
locking primitives would be broken.  The store_rel is actually a release
barrier which acts more like a read/write fence.  No memory accesses (read or
write) from before the release can be scheduled after the associated store,
either by the compiler or CPU.  That is what Konstantin is referring to in his
commit when he says release semantics.

Yes, that makes sense, however does it specify that the writes *must*
occur at that *point*?  If it only enforces ordering then we may have
some issue, specifically because the setting of it to '1' inside of
intr_event_schedule_thread has no barrier other than the acq semantics
of the thread lock.  I am wondering what is forcing out the '1' there.

Nothing ever forces writes.  You would have to invalidate the cache to do that
and that is horribly expensive.  It is always only about ordering and knowing
that if you can complete another operation on the same cookie variable with
acquire semantics that earlier writes will be visible.

By cookie, you mean a specific memory address, basically a lock? This is
starting to reinforce my suspicions as the setting of it_need is done
with release semantics, however the acq on the other CPU is done on the
thread lock.  Maybe that is irrelevant.  We will find out shortly.


See below as I think we have proof that this is somehow happening.

Having ih_need of 1 and it_need of 0 is certainly busted.  The simplest fix
is probably to stop using atomics on it_need and just grab the thread lock
in the main ithread loop and hold it while checking and clearing it_need.


OK, we have some code that will either prove this, or perturb the memory
ordering enough to make the bug go away, or prove this assertion wrong.

We will update on our findings hopefully in the next few days.

IMO the read of it_need in the 'while (ithd->it_need)' should
have acquire semantics, otherwise the future reads in the
ithread_execute_handlers(), in particular, of the ih_need, could pass
the read of it_need and cause the situation you reported.  I do not
see any acquire barrier between a condition in the while() statement
and the read of ih_need in the execute_handlers().

It is probably true that the issue you see was caused by r236456, in the
sense that the implicitly locked xchgl instruction on x86 has full barrier
semantics.  As a result, the store_rel() was actually an acquire too, making
this reordering impossible.  I argue that this is not a bug in r236456,
but an issue in kern_intr.c.

If I remember the code correctly that would probably explain why we see
it only on 9.1 systems.


On the other hand, John's suggestion to move the manipulations of
it_need under the lock is probably the best anyway.

I was wondering if it would be lower latency to maintain it_need, 
however to keep another variable it_needlocked under the thread lock.  
This would result in potential superfluous interrupts, however under 
load you would allow the ithread to loop without taking the thread lock 
some number of times.


I am not really sure if this is really worth the optimization 
(especially since it can result in superfluous interrupts) however it 
may reduce latency and that might be important.


Are there some people that I can pass the patch on to for help with
performance once we confirm that this is the actual bug?   We can do
internal testing, but I am worried about regressing performance of any
form of IO for the kernel.


I'll show the patch soon.

Thank you for the information.  This is promising.

-Alfred





Re: FreeBSD installers and future direction

2013-05-27 Thread Alfred Perlstein

On 5/26/13 10:03 AM, Dirk Engling wrote:

On 26.05.13 04:51, Super Bisquit wrote:

Please don't turn this into an architecture-dependent mess. PCBSD is
i386 & AMD64 only.

Read my email thoroughly and notice that I never seriously considered
using pc-sysinstall after looking into it. Don't worry.



Why is that exactly?

A number of people are using it successfully to install ZFS based systems.

-Alfred


Re: FreeBSD installers and future direction

2013-05-27 Thread Alfred Perlstein

On 5/25/13 8:45 PM, Teske, Devin wrote:

On May 25, 2013, at 7:51 PM, Super Bisquit wrote:


Please don't turn this into an architecture-dependent mess. PCBSD is
i386 & AMD64 only.


There's a GSoC project (of which I'm potential mentor) to fix that.

However, you are entirely right… we can't in all seriousness even think about 
using pc-sysinstall until it is solid on all architectures as bsdinstall 
already is.

GSoC project is: Making pc-sysinstall FreeBSD ready by porting it to multiple 
architectures



Why can we not, in the interim, use pc-sysinstall on the platforms
that it performs best on and use bsdinstall on the others?


It doesn't make sense for us to hold up some platform like this at all.

Maybe no one has thought of this?  Basically use pc-sysinstall on amd64 
and i386 and use bsdinstall on the other platforms until they catch up?


-Alfred


Re: FreeBSD installers and future direction

2013-05-27 Thread Alfred Perlstein

On 5/27/13 9:56 AM, Bruce Cran wrote:

On 27/05/2013 16:48, Alfred Perlstein wrote:
Why can we not, in the interim, use pc-sysinstall on the platforms
that it performs best on and use bsdinstall on the others?


Because pc-sysinstall doesn't have a UI - it's only a backend. If we 
update bsdinstall to use it, then it won't work on other platforms.


This still doesn't make sense to me.  Why can bsdinstall not 
conditionally use it?


Do we always have to seek the lowest common denominator for our user 
experience?


-Alfred


Re: FreeBSD installers and future direction

2013-05-27 Thread Alfred Perlstein

On 5/27/13 11:40 AM, Bruce Cran wrote:

On 27/05/2013 19:03, Alfred Perlstein wrote:
Do we always have to seek the lowest common denominator for our user 
experience?


Yes.


Is this a joke?

-Alfred


Re: FreeBSD installers and future direction

2013-05-27 Thread Alfred Perlstein

On 5/27/13 2:23 PM, Bruce Cran wrote:

On 27/05/2013 21:28, Alfred Perlstein wrote:

On 5/27/13 11:40 AM, Bruce Cran wrote:

Yes.

Is this a joke?


It probably /was/ too short a reply. Personally I think there should 
be a single UI and scripting interface across all platforms. We should 
try and get pc-sysinstall running on all of them first in case there's 
some problem that means it can't be done, in which case we'd need to 
use a different backend.




There are just going to be certain platforms that make it EASY to do 
cool things.  We should embrace that!  That's why there are different 
platforms!


Some are great for low power, others are great for graphics, cpu power, 
gpu, networking etc.


If we always go for the lowest common denominator then we are crippling 
all the platforms for no one's benefit.  Even if something CAN be done, 
if it is very difficult, or just never happening, then we can't limit 
everyone's experience based on the more difficult and/or resource 
strapped platforms.


It's just not good business.

-Alfred



Re: FreeBSD installers and future direction

2013-05-27 Thread Alfred Perlstein

On 5/27/13 6:53 PM, Nathan Whitehorn wrote:

On 05/27/13 20:40, Alfred Perlstein wrote:

On 5/27/13 2:23 PM, Bruce Cran wrote:

On 27/05/2013 21:28, Alfred Perlstein wrote:

On 5/27/13 11:40 AM, Bruce Cran wrote:

Yes.

Is this a joke?


It probably /was/ too short a reply. Personally I think there should 
be a single UI and scripting interface across all platforms. We 
should try and get pc-sysinstall running on all of them first in 
case there's some problem that means it can't be done, in which case 
we'd need to use a different backend.




There are just going to be certain platforms that make it EASY to do 
cool things.  We should embrace that!  That's why there are different 
platforms!


Some are great for low power, others are great for graphics, cpu 
power, gpu, networking etc.


If we always go for the lowest common denominator then we are 
crippling all the platforms for no one's benefit.  Even if something 
CAN be done, if it is very difficult, or just never happening, then 
we can't limit everyone's experience based on the more difficult 
and/or resource strapped platforms.


It's just not good business.


Yes, and all of this cuts both ways: pc-sysinstall has no wireless 
setup support, for instance. Right now we support what we support 
because it is the most feature-complete thing we have, not just on 
tier-2 platforms but also on x86.


To bring this discussion back to the ground, the fact is that we lack 
an installer that has both internal support for ZFS and a UI. One of 
the reasons for this is that making a good expressive UI for ZFS is a 
non-trivial undertaking given its enormous flexibility. The bsdinstall 
partition editor has been written to be extensible for this, and 
several people have started writing code to do it, but no one ended up 
having time to finish. Probably a reasonable thing to do is to start 
with supporting only a minimal set of features. If anyone felt like 
actually writing this code, I'm sure it would be appreciated by all 
and be more productive than email exchanges.

-Nathan


I'm sure if there was a list of reasonable things, such as wireless,
then pc-sysinstall could be augmented.  This is the first I've heard of
that.  All the other complaints have been based on portability.


Is that all that is required now, wireless?



-Alfred



Re: syscall to userland interface

2013-05-11 Thread Alfred Perlstein

On 5/11/13 1:23 AM, Karl Dreger wrote:


I am feeling rather stupid at the moment, but I can't find the assembler
files that you are referring to. Do you mean that every syscall under
sys/kern/*.c has a corresponding .S file in src/lib/libc/?


Nope, the .S files are under the object directory:


When you build the system a whole bunch of assembler files are
automatically generated that define the functions you are looking for.

Look for .S files under the object directory.

Those assembler files have the magic to cause a system call to happen.

example: src/lib/libc/getauid.S  (note, this file is GENERATED, it's not
part of src.)







The actual transition from user to kernelland and back probably takes
place via the assembler routines in sys/i386/i386. Most notably exception.s
for my i386 cpu.

What my question boils down to is this: when running fork and friends
from userland they are invoked as:

fork(), open(), read(), close(), ...

but are defined as:

sys_fork(), sys_open(), sys_read(), sys_close(), ...

in their actual C definition.

If the assembler files that you spoke about answer this discrepancy,
then the reason why the penny hasn't dropped yet is because I haven't
found them.


Again, they are generated as part of build.  You will NOT find them 
during a checkout.


-Alfred


Re: syscall to userland interface

2013-05-10 Thread Alfred Perlstein

On 5/10/13 12:31 PM, Karl Dreger wrote:

Hello,
I have been taking a look at a few syscalls in /usr/src/sys/kern/ and
always find that in their actual C definition the function names are
prepended with sys_. Take for example the fork system call which
is found in /usr/src/sys/kern/kern_fork.c

int
sys_fork(struct thread *td, struct fork_args *uap)
...

Now when I write a program from userland that makes use of the
fork system call, I call it as:

fork();

All the syscalls are part of libc, which is usually defined in
/usr/src/lib/libc/

Since the system calls are already defined in the kernel sources, they
no longer need to be defined in /usr/src/lib/libc/. This is the reason
why one can only find the manpages and no c files in
/usr/src/lib/libc/sys?
At least this is how my thinking goes.

Now, when the syscalls in the kernel sources are all defined as sys_xxx
but are invoked as xxx and the C headers also show syscall prototypes
without any prepended sys_, how does the actual move between userland
and kernelland happen? In other words, why do I invoke fork() as fork()
and not as sys_fork()?

Or is there something that I missed?


Clarification on that point is highly welcome.


When you build the system a whole bunch of assembler files are 
automatically generated that define the functions you are looking for.


Look for .S files under the object directory.

Those assembler files have the magic to cause a system call to happen.

example: src/lib/libc/getauid.S  (note, this file is GENERATED, it's not 
part of src.)




-Alfred



potential future proofing fix for aicasm build.

2013-05-01 Thread Alfred Perlstein


Hey folks,

I took a shot at fixing this issue with building aicasm as part of
buildkernel of an older 9.0 src on a machine running HEAD.

aicasm.o: In function `__getCurrentRuneLocale': 
/usr/include/runetype.h:96: undefined reference to `_ThreadRuneLocale'

The issue seems to be two-fold:

1) Paths are not fully set to pick up the bootstrap tools needed to build.
2) Include files are taken from the host instead of the build tree.

The first problem is fixed by changing setting of PATH from
${BPATH}:${PATH} to ${TMPPATH}.

The second is fixed by using -nostdinc and setting strict include paths
using -I directives to the compiler:

CFLAGS=-nostdinc -I${WORLDTMP}/usr/include -I. 
-I${KERNSRCDIR}/dev/aic7xxx/aicasm



Can I get review on this patch?

https://gist.github.com/anonymous/5493734

Inline:

diff --git a/Makefile.inc1 b/Makefile.inc1
index e850cda..785e3180 100644
--- a/Makefile.inc1
+++ b/Makefile.inc1
@@ -830,17 +830,18 @@ buildkernel:
 @echo  stage 2.3: build tools
 @echo --
 cd ${KRNLOBJDIR}/${_kernel}; \
-   PATH=${BPATH}:${PATH} \
+   PATH=${TMPPATH} \
 MAKESRCPATH=${KERNSRCDIR}/dev/aic7xxx/aicasm \
-   ${MAKE} SSP_CFLAGS= -DNO_CPU_CFLAGS -DNO_CTF \
+   ${MAKE} SSP_CFLAGS= -DNO_CPU_CFLAGS -DNO_CTF CFLAGS=-nostdinc 
-I${WORLDTMP}/usr/include -I. -I${KERNSRCDIR}/dev/aic7xxx/aicasm \
 -f ${KERNSRCDIR}/dev/aic7xxx/aicasm/Makefile
  # XXX - Gratuitously builds aicasm in the ``makeoptions NO_MODULES'' case.
  .if !defined(MODULES_WITH_WORLD) && !defined(NO_MODULES) &&
exists(${KERNSRCDIR}/modules)
  .for target in obj depend all
+   @echo  aicasm: ${target} 
 cd ${KERNSRCDIR}/modules/aic7xxx/aicasm; \
-   PATH=${BPATH}:${PATH} \
+   PATH=${TMPPATH} \
 MAKEOBJDIRPREFIX=${KRNLOBJDIR}/${_kernel}/modules \
-   ${MAKE} SSP_CFLAGS= -DNO_CPU_CFLAGS -DNO_CTF ${target}
+   ${MAKE} SSP_CFLAGS= -DNO_CPU_CFLAGS -DNO_CTF CFLAGS=-nostdinc 
-I${WORLDTMP}/usr/include -I. -I${KERNSRCDIR}/dev/aic7xxx/aicasm ${target}
  .endfor
  .endif
  .if !defined(NO_KERNELDEPEND)








Re: potential future proofing fix for aicasm build.

2013-05-01 Thread Alfred Perlstein

On 5/1/13 9:57 AM, Steven Hartland wrote:

I don't believe aicasm is actually needed if you don't have a driver
which requires it, e.g. ahd or ahc. It would be good to get that fixed too.


True, but a challenge I don't currently have time for.

I'm about to kick-off a universe build with my changes in aicasm, but I 
was hoping someone could tell me if I was going in the right direction here.


-Alfred


Re: potential future proofing fix for aicasm build.

2013-05-01 Thread Alfred Perlstein

On 5/1/13 2:38 PM, Brooks Davis wrote:

On Wed, May 01, 2013 at 09:44:54AM -0700, Alfred Perlstein wrote:

Hey folks,

I took a shot at fixing this issue with building aicasm as part of
buildkernel of an older 9.0 src on a machine running HEAD.

aicasm.o: In function `__getCurrentRuneLocale': 
/usr/include/runetype.h:96: undefined reference to `_ThreadRuneLocale'

The issue seems to be two-fold:

1) Paths are not fully set to pick up the bootstrap tools needed to build.
2) Include files are taken from the host instead of the build tree.

The first problem is fixed by changing the setting of PATH from
${BPATH}:${PATH} to ${TMPPATH}.

The second is fixed by using -nostdinc and setting strict include paths
using -I directives to the compiler:

CFLAGS=-nostdinc -I${WORLDTMP}/usr/include -I. 
-I${KERNSRCDIR}/dev/aic7xxx/aicasm

This seems basically ok.


Can I get review on this patch?

The line wrapping bugs should have been fixed before posting, but it
otherwise looks fine.

I do wonder why we don't just install aicasm in the base and bootstrap
it in the unlikely event that it changes in an important way.  A quick
scan of svn log suggests that gibbs fixed a bug in mid-2010 and the last
non-build system or portability change was circa 2003 so I don't think
we'd break old-style kernel builds at a rate worth worrying about.


It looks sort of like a shortcut was taken so that changes to the tool 
can be picked up by a kernel compile instead of needing another step.  
That was probably convenient at the time, but now is somewhat of a problem.


If I have time I will see about moving it to base.

Thank you for the review.  I will fix the white space and give make 
universe a whirl now.


-Alfred


Re: Adding a FOREACH_CONTINUE() variant to queue(3)

2013-04-30 Thread Alfred Perlstein

On 4/30/13 8:57 PM, Lawrence Stewart wrote:

[reposting from freebsd-arch@ - was probably the wrong list]

Hi all,

I've had use for these a few times now when wanting to restart a loop at
a previously found element, and wonder if there are any thoughts about
sticking them (and equivalents for other list types) in sys/queue.h?

Cheers,
Lawrence

#define TAILQ_FOREACH_CONTINUE(var, head, field)\
for ((var) = ((var) ? (var) : TAILQ_FIRST((head))); \
(var);  \
(var) = TAILQ_NEXT((var), field))


#define SLIST_FOREACH_CONTINUE(var, head, field)\
for ((var) = ((var) ? (var) : SLIST_FIRST((head))); \
(var);  \
(var) = SLIST_NEXT((var), field))



Can you show a few uses please?  If it can significantly cut down on 
extra code it seems wise.
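
One possible use, sketched under assumptions: the element type, list
contents and function names below are hypothetical, and only the macro
itself comes from the proposal above. A first pass locates an element,
and a second loop resumes from that element instead of from the head.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Macro as proposed above; guarded in case a header already provides it. */
#ifndef TAILQ_FOREACH_CONTINUE
#define TAILQ_FOREACH_CONTINUE(var, head, field)			\
	for ((var) = ((var) ? (var) : TAILQ_FIRST((head)));		\
	    (var);							\
	    (var) = TAILQ_NEXT((var), field))
#endif

struct item {				/* hypothetical element type */
	int value;
	TAILQ_ENTRY(item) link;
};
TAILQ_HEAD(item_list, item);

/* First pass: find the first element whose value is at least min. */
struct item *
find_first_at_least(struct item_list *head, int min)
{
	struct item *it;

	TAILQ_FOREACH(it, head, link)
		if (it->value >= min)
			return (it);
	return (NULL);
}

/* Second pass: resume at start (or at the head when start is NULL). */
int
sum_from(struct item_list *head, struct item *start)
{
	struct item *it = start;
	int sum = 0;

	TAILQ_FOREACH_CONTINUE(it, head, link)
		sum += it->value;
	return (sum);
}

/* Builds the list 1..5 and sums from the first element >= 3. */
int
demo_resume_sum(void)
{
	struct item_list head = TAILQ_HEAD_INITIALIZER(head);
	struct item items[5];

	for (int i = 0; i < 5; i++) {
		items[i].value = i + 1;
		TAILQ_INSERT_TAIL(&head, &items[i], link);
	}
	return (sum_from(&head, find_first_at_least(&head, 3)));
}
```

One property worth noticing in this sketch: when the first pass finds
nothing and returns NULL, the macro silently restarts from the head, so
callers that want "no match means no iteration" have to check for NULL
themselves.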


-Alfred


Re: Multiple page size support on FreeBSD?

2013-04-10 Thread Alfred Perlstein

On 4/10/13 11:42 AM, Benjamin Kaduk wrote:

On Wed, 10 Apr 2013, Wojciech Puchar wrote:

How do your tests work?  Do you examine PTEs directly to check for 
superpages or are you relying on the vm.pmap.pde sysctls?


the latter.

anyway - the algorithm described on the list, a heuristic that detects 
consecutive page accesses, doesn't really help the urgent case - RANDOM 
access to a large amount of memory.


The algorithm is not a heuristic based on consecutive accesses; 
promotion occurs when the entire superpage's worth of memory has 
actually been accessed.  If I remember correctly, the performance gain 
from superpages was only a few percent, so spending more time trying 
to decide when to use them would make the algorithm a net wash.


You should really watch the talk I linked to if you're interested, it 
was quite interesting.



sequential access will get minimal improvement.

IMHO the only way that really makes sense is to add options to madvise 
to give the kernel information about usage.


Maybe.


It is cool that FreeBSD got this work via Alan Cox and the others that 
contributed.


I am wondering if it makes sense to have an explicit model.

At one place, for a platform with high performance but a very small TLB, 
we made it possible to explicitly request a large TLB for our process 
and it made a BIG difference for performance.


Sometimes being general purpose means that you can expose such low 
level things for the user to tune instead of requiring them to fit the 
app to a heuristic that may change.


-Alfred




-Ben Kaduk


Re: Multiple page size support on FreeBSD?

2013-04-10 Thread Alfred Perlstein

On 4/10/13 1:09 PM, Andrew Duane wrote:

Like all performance items (especially VM), it depends on the hardware and 
the load. On systems with small TLBs it helps more than with large TLBs. With software 
that needs access to lots of different areas the TLB gets more traffic so large ones help 
more. The answer for your firefox browser box with i386 is probably different from my 
compilation engine running MIPS, or his web server running AMD.

Back at Digital, we spent a lot of time trying to find the one true answer to 
superpages, only to discover there wasn't one. We ended up with a combination of 
automatic use from big allocations, a rarely used API to advise for big TLBs, and some 
background work that coalesced when possible.


Thank you Andrew.  I agree.  A good heuristic is great, but sometimes 
exposing the API unlocks some really awesome performance capabilities.


It seems like both Digital and Sun went this route.

I'm hoping we can do that as well.

-Alfred


  
Andrew L. Duane
Resident Architect - ATT Technical Lead
m   +1 603.770.7088
o   +1 408.933.6944 (2-6944)
skype: andrewlduane
adu...@juniper.net



-Original Message-
From: owner-freebsd-hack...@freebsd.org 
[mailto:owner-freebsd-hack...@freebsd.org] On Behalf Of Alfred Perlstein
Sent: Wednesday, April 10, 2013 4:00 PM
To: Benjamin Kaduk
Cc: Wojciech Puchar; Sebastian Feld; freebsd-hackers@freebsd.org
Subject: Re: Multiple page size support on FreeBSD?

On 4/10/13 11:42 AM, Benjamin Kaduk wrote:

On Wed, 10 Apr 2013, Wojciech Puchar wrote:


How do your tests work?  Do you examine PTEs directly to check for
superpages or are you relying on the vm.pmap.pde sysctls?

the latter.

anyway - the algorithm described on the list, a heuristic that detects
consecutive page accesses, doesn't really help the urgent case - RANDOM
access to a large amount of memory.

The algorithm is not a heuristic based on consecutive accesses,
promotion occurs when the entire superpage's worth of memory has
actually been accessed.  If I remember correctly, the performance gain
from superpages was only a few percent, so spending more time trying
to decide when to use them would make the algorithm a net wash.

You should really watch the talk I linked to if you're interested, it
was quite interesting.


sequential access will get minimal improvement.

IMHO the only way that really makes sense is to add options to madvise
to give the kernel information about usage.

Maybe.

It is cool that FreeBSD got this work via Alan Cox and the others that 
contributed.

I am wondering if it makes sense to have an explicit model.

At one place, for a platform with high performance but a very small TLB, we 
made it possible to explicitly request a large TLB for our process and it made 
a BIG difference for performance.

Sometimes being general purpose means that you can expose such low level 
things for the user to tune instead of requiring them to fit the app to a heuristic that 
may change.

-Alfred



-Ben Kaduk


Re: GSOC 2013 project Kernel Size Reduction for Embedded System

2013-04-09 Thread Alfred Perlstein

On 4/8/13 6:42 PM, Adrian Chadd wrote:

Well, it's relatively easy to experience what it's like.


No it's not.  We all have jobs that demand different things from us.  
Taking the time to guess at the problem, only to be told you're doing 
it wrong by someone actually in the position to build the list of 
requirements, is not only a big honking waste of time but also neither 
fun nor interesting.


Either gather together some acceptable feature/performance regressions 
that small systems can live with or stop evangelizing.  Looking at .o 
files and guessing what to trim isn't going to work.


It sounds like what you want is some magic where you get all the 
features and your small image while not having to compromise on 
features/speed.  Cool, maybe someone will invent something amazing that 
gives you more from less, but until then it makes sense to actually be 
pragmatic and put together a list of things to trim based on sacrifice, 
not just because they are big.


-Alfred



Reboot your machine with 32MB. Try to do things like bring up network
interfaces. Snark when stupid stuff occurs, like you can't allocate
enough mbufs for the driver RX path _and_ run the ifconfig command to
completion to bring said interface up.

There's just a lot of code. You can start by cross-building one of the
MIPS kernels targeting a small system (eg AP121) and look at the
text/data sections of the resulting .o's. Group them together into
subsystems and take a look.

Now, as for what we can get away with - I'm still going through
another round of review. Yes, there's likely a bunch of syscalls or
syscall behaviours that we just don't need in the embedded world.
Things like all the POSIX compatible fine grained locking? Likely
don't need. But there's some reasonably big areas of bloat that we
could easily hit right now. I've chopped out some of the more silly
abuses in the past (posix acl code that only gets used by ZFS, always
being compiled in? Sigh.)

Eg:

   text    data     bss     dec     hex filename
  59772     160   49184  109116   1aa3c kern_umtx.o

That's a lot of both code and bss just for mutex handling, don't you
think? Do we really need 59KiB of code and 48KiB of BSS just for mutex
handling?

   text    data     bss     dec     hex filename
    184       0   12160   12344    3038 sc_machdep.o

.. 8 consoles? 12k of BSS? again, not much, but ..

adrian@lucy:~/work/freebsd/svn/src/sys/cam] cat /tmp/AP121-nodebug.txt | egrep 'ata'
   text    data     bss     dec     hex filename
  11536       0       0   11536    2d10 ata_all.o
  17624    1504      16   19144    4ac8 ata_da.o
   6388     448      16    6852    1ac4 ata_pmp.o
  18960     304       0   19264    4b40 ata_xpt.o

.. 52 odd KiB tied up in CAM ATA transport, which we don't use unless
the ATA code is compiled in. It's just sitting there, waiting for an
ATA device to come along.

lucy# cat /tmp/AP121-nodebug.txt | grep vfs_ | grep -v devfs | sort -k3
   4160      48       0    4208    1070 vfs_acl.o
   4752      48       0    4800    12c0 vfs_export.o
   5464       0       0    5464    1558 vfs_extattr.o
   8128     288       0    8416    20e0 vfs_default.o
  11020     160       0   11180    2bac vfs_cluster.o
   7916      96      16    8028    1f5c vfs_lookup.o
  19908     144      16   20068    4e64 vfs_vnops.o
  34504     208      16   34728    87a8 vfs_syscalls.o
   3068      64      32    3164     c5c vfs_hash.o
  22700     208      32   22940    599c vfs_mount.o
   1760     144     160    2064     810 vfs_init.o
  14520      16     160   14696    3968 vfs_mountroot.o
  13996    1568     176   15740    3d7c vfs_cache.o
  64852    1680     256   66788   104e4 vfs_subr.o
  52188    2000     304   54492    d4dc vfs_bio.o

.. 260KiB just for VFS handling.

etc, etc.

I'd love to fix this, but I have to make a choice right now between
porting to more of the Atheros wifi/soc platforms, or tackling this
particular issue. I'd love for others to help out here. I'm sure that
reducing code size in general is going to be beneficial on the lower end
platforms, even just in cache savings.

Thanks,


Adrian


Re: GSOC 2013 project Kernel Size Reduction for Embedded System

2013-04-09 Thread Alfred Perlstein

On 4/9/13 10:36 AM, Wojciech Puchar wrote:

happy that FreeBSD is among the selected organizations.

I am a third year student interested in working in the field of embedded
systems. I applied last year and the title of my project was Kernel Size

Why only in embedded systems? Smaller programs are always good :)

And yes, the FreeBSD kernel is huge. It doesn't really matter with 1GB or 
more RAM, but yes - it is huge even relative to Linux.


Ah, any insight as to why?

-Alfred



Re: GSOC 2013 project Kernel Size Reduction for Embedded System

2013-04-08 Thread Alfred Perlstein

On 4/8/13 4:10 PM, Adrian Chadd wrote:

Hi,

Your idea is interesting, but it doesn't fix the underlying problem -
there's just too much code. :(

If you were to API'ify some of the more basic things such as fget, 
fdrop, and the filedesc stuff, you could potentially swap out those 
systems for simpler (albeit less efficient) algorithms. The cost there 
may be slow SMP performance, or maybe not allowing threads?


What we really need is someone to pin down those parts of code that 
smaller systems may not need and provide compromise when we remove them.


Other ideas are simpler, for instance removing certain syscalls (for 
example, more recent ones such as openat) and features such as unix 
descriptor passing.


However, until a bunch of embedded folks come forward and state what 
they are really willing to sacrifice, then we won't really have anything 
to go on, and it will be guessing at what will work for a space that not 
all of us are familiar with.


So I'm hoping some people can make the tough call to give direction 
here, otherwise nothing good will come of it.


Has anyone actually done this?  Or maybe compared against another 
embedded OS?


-Alfred


Re: considering i386 as a tier 1 architecture

2013-04-02 Thread Alfred Perlstein

As far as I can tell it's now April 2nd in all time zones.

Can we now end this thread?

thank you,
-Alfred


On 4/2/13 6:22 AM, Paul Schenkeveld wrote:

On Tue, Apr 02, 2013 at 10:22:20AM +, Ruben de Groot wrote:

On Tue, Apr 02, 2013 at 03:10:56AM -0700, Mehmet Erol Sanliturk typed:

On Tue, Apr 2, 2013 at 2:04 AM, Dag-Erling Smørgrav d...@des.no wrote:


Wojciech Puchar woj...@wojtek.tensor.gdynia.pl writes:

Lev Serebryakov l...@freebsd.org writes:

It is not exact so. Some Atoms on some motherboards with some
firmwares are 64-bit CPU.

don't know of any now in shops that are not

http://soekris.com/products/net5501.html
http://soekris.com/products/net6501.html

DES
--
Dag-Erling Smørgrav - d...@des.no



I am NOT able to understand the merit of these products with respect to
their features and PRICES.

They are extremely stable and robust.


It is possible to assemble much cheaper full-featured PC-like systems
(DDR3 memory, 64-bit capable processors, with a disadvantage in power
requirements).

You can also get much bigger portions at McDonald's than what you get in a
five star restaurant.

Soekris boards are perhaps not five star boards but at least they have
four :)

Although the thread started as an April Fools' Day prank, it's getting
serious now about the value of having i386 next to amd64.

I'm using quite a number of net4501/net4801/net5501/net6501 in many
places just because I haven't found anything that can do the same job
with the same reliability at the same low power diet for a reasonable
price.

For people on a tight budget with lower reliability expectations there
are the PC-engines Alix boards.  Except for the net6501, none of these
can run amd64.

Even though the net6501 can run amd64, I prefer running i386 on them
(and other boards where I do not need >= 4GB of RAM or the large address
space) instead of amd64 just because the system image is so much smaller,
requiring less storage on your filesystem (often a small flash device) and
less time to upload changes over the Internet when doing remote upgrades,
and they are more efficient with virtual memory.  Running amd64 when not
really needed is just a waste of resources.

There have been discussions in the past whether it would make sense to
run a 32-bit userland on top of an amd64 kernel so you can have 4GB of
RAM but keep your userland relatively small.  There are only a few
applications that really benefit from a 64-bit address space; others could
well be 32-bit apps.

Just some random numbers to illustrate my point:

i386$ size /bin/sh /bin/ls /usr/bin/find /usr/bin/cc

   text    data     bss     dec     hex filename
 111533    1048    7460  120041   1d4e9 /bin/sh
  22808     572     396   23776    5ce0 /bin/ls
  33098     760    3392   37250    9182 /usr/bin/find
 314841    9376   18204  342421   53995 /usr/bin/cc

amd64$ size /bin/sh /bin/ls /usr/bin/find /usr/bin/cc

   text    data     bss     dec     hex filename
 129371    1992   10272  141635   22943 /bin/sh
  26255    1144     536   27935    6d1f /bin/ls
  43464    1352    4680   49496    c158 /usr/bin/find
 383330   15296   58664  457290   6fa4a /usr/bin/cc
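
To put a number on the difference, here is a small sketch that computes
the growth of the "dec" column (text + data + bss) from i386 to amd64 in
tenths of a percent. The struct, array and function names are made up
for the illustration; the figures are the ones from the two size(1)
listings above. By this measure the four binaries grow by roughly 17 to
34 percent.

```c
#include <stdio.h>

struct sample {			/* hypothetical record for one binary */
	const char *name;
	long i386_dec;		/* "dec" column on i386 */
	long amd64_dec;		/* "dec" column on amd64 */
};

static const struct sample samples[] = {
	{ "/bin/sh",       120041, 141635 },
	{ "/bin/ls",        23776,  27935 },
	{ "/usr/bin/find",  37250,  49496 },
	{ "/usr/bin/cc",   342421, 457290 },
};

/* Growth from i386 to amd64 in tenths of a percent, truncated. */
long
growth_permille(long i386_dec, long amd64_dec)
{
	return ((amd64_dec - i386_dec) * 1000 / i386_dec);
}

/* Prints one line per binary with its growth in percent. */
void
print_growth(void)
{
	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		long g = growth_permille(samples[i].i386_dec,
		    samples[i].amd64_dec);
		printf("%-14s +%ld.%ld%%\n", samples[i].name, g / 10, g % 10);
	}
}
```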

Kind regards,

Paul Schenkeveld


Re: pop filters from kqueue

2013-03-05 Thread Alfred Perlstein

On 3/5/13 8:03 AM, Dirk Engling wrote:

Dear fellow FreeBSD hackers,

while writing a daemon that uses a kqueue to keep track of forked 
processes and pipes to userland client code etc, I noticed a lack of 
features to implement a proper shutdown without holding data redundantly.


When my daemon quits, I can not ask the kqueue for my installed 
filters and get back the udata I passed to the kqueue.


This is unfortunate, because I like the idea of having only one owner 
per memory allocation. The most obvious use would be a per-fd-state 
held in a memory block. When passing it to kevent() via the udata 
entry, I would make this filter the owner of my allocation.


However, when gracefully shutting down, my daemon has no way of 
retrieving all the values passed to the filters. For most cases that 
may be okay:

memory allocations will just be thrown away on exit(), anyway.

But once I need to clean up external state, like sending signals to 
processes I installed an EVFILT_PROC for etc, I need to keep a 
redundant list of pids and the associated udata. This violates the 
rule of strict ownership and introduces room for inconsistencies.


Is there a specific reason I have overlooked that would forbid popping 
untriggered filters from my kqueue? Or is there even a way to do so 
that I have missed?


I'm not sure if kqueue supports this; however, adding such a facility 
might be OK.


Another way to handle this is just to make your udata pointers all point 
to items in a doubly linked list, with the head structure aware of which 
kqueue they belong to.


Then at the end you can just traverse the list of items you have not yet 
popped off.


The only pain here is that it requires managing a doubly linked list and 
additional pointer dereferences.
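
A rough sketch of that bookkeeping, with the kqueue calls themselves
left out so that it compiles anywhere <sys/queue.h> exists. All struct
and function names here are hypothetical. The pointer returned by
watch_add() is what would be passed as udata when registering the
filter, so the list node is the per-filter state rather than a second
copy of it.

```c
#include <stdlib.h>
#include <sys/queue.h>

struct watch {				/* hypothetical per-filter state */
	int ident;			/* fd or pid the filter watches */
	LIST_ENTRY(watch) link;		/* doubly linked: O(1) removal */
};
LIST_HEAD(watch_list, watch);

struct tracker {			/* the "head structure" */
	int kq;				/* which kqueue the watches belong to */
	struct watch_list live;		/* filters not yet popped off */
};

void
tracker_init(struct tracker *t, int kq)
{
	t->kq = kq;
	LIST_INIT(&t->live);
}

/* Allocate state and link it; the caller registers the filter with udata = w. */
struct watch *
watch_add(struct tracker *t, int ident)
{
	struct watch *w = malloc(sizeof(*w));

	if (w == NULL)
		return (NULL);
	w->ident = ident;
	LIST_INSERT_HEAD(&t->live, w, link);
	return (w);
}

/* Called with the udata of an event that fired: unlink and free. */
void
watch_done(struct watch *w)
{
	LIST_REMOVE(w, link);
	free(w);
}

/* Shutdown: visit everything that never fired; returns how many there were. */
int
tracker_drain(struct tracker *t, void (*cleanup)(struct watch *))
{
	struct watch *w;
	int n = 0;

	while ((w = LIST_FIRST(&t->live)) != NULL) {
		if (cleanup != NULL)
			cleanup(w);	/* e.g. signal the pid in w->ident */
		watch_done(w);
		n++;
	}
	return (n);
}

/* Three watches, one "fires", two are left to drain at shutdown. */
int
demo_tracker(void)
{
	struct tracker t;
	struct watch *a;

	tracker_init(&t, -1);		/* -1: no real kqueue in this demo */
	a = watch_add(&t, 3);
	watch_add(&t, 4);
	watch_add(&t, 5);
	if (a != NULL)
		watch_done(a);
	return (tracker_drain(&t, NULL));
}
```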


-Alfred



Re: memory allocation in spinlock context

2013-03-01 Thread Alfred Perlstein

On 3/1/13 5:50 AM, Andriy Gapon wrote:

I am trying to understand if it is possible to allow memory allocations
(M_NOWAIT, of course) in a spinlock context.
I do not see any obvious architectural obstacles.
But the fact that all of the uma locks, system map lock, object locks, page
queue locks and so on are regular mutexes makes it impossible to allocate
memory without violating the fundamental lock ordering rules.

Could all of the above mentioned locks potentially be converted to spin mutexes?
(And that seems to be a large nasty change)
Are there any alternative possibilities?

BTW, currently we have at least one place where a memory allocation of this kind
is done stealthily (and thus dangerously?).  ACPI resume code must execute
AcpiLeaveSleepStatePrep with interrupts disabled and ACPICA code performs memory
allocations in that code path.  Since the interrupts are disabled by means of
intr_disable(), witness(9) and similar are completely oblivious of the fact.

Typically the need for such a facility means that the locks are being 
held for too long.


I think someone has suggested using a private allocator carving out of a 
pre-allocated space.


Depending on the subsystem you are allocating for this may work for you.

I am looking to do this for the kernel gzip routines so that we can do 
compressed kernel dumps as soon as I verify the bounds of the gzip 
allocations.
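
A userland sketch of the "carve out of a pre-allocated space" idea,
written as a simple bump allocator. This is an assumption about what
such a private allocator could look like, not kernel code and not the
allocator anyone in this thread actually wrote; all names are
hypothetical. The point is that the allocation path takes no locks of
its own and never sleeps, so whatever lock the caller already holds is
the only one involved.

```c
#include <stddef.h>
#include <stdint.h>

#define ARENA_ALIGN 16			/* assumed alignment for every block */

struct arena {
	unsigned char *base;		/* start of the pre-allocated region */
	size_t size;			/* total bytes in the region */
	size_t used;			/* bytes handed out so far */
};

void
arena_init(struct arena *a, void *buf, size_t size)
{
	a->base = buf;
	a->size = size;
	a->used = 0;
}

/* Hand out n bytes from the region, or NULL when it is exhausted. */
void *
arena_alloc(struct arena *a, size_t n)
{
	uintptr_t start = (uintptr_t)a->base + a->used;
	uintptr_t aligned = (start + ARENA_ALIGN - 1) &
	    ~(uintptr_t)(ARENA_ALIGN - 1);
	size_t off = (size_t)(aligned - (uintptr_t)a->base);

	if (off > a->size || n > a->size - off)
		return (NULL);
	a->used = off + n;
	return (a->base + off);
}

/* Everything is released at once; there is no per-block free. */
void
arena_reset(struct arena *a)
{
	a->used = 0;
}

/* Two small requests fit in a 64-byte pool, a 64-byte one does not. */
int
demo_arena(void)
{
	static unsigned char pool[64];
	struct arena a;
	void *p1, *p2, *p3;

	arena_init(&a, pool, sizeof(pool));
	p1 = arena_alloc(&a, 16);
	p2 = arena_alloc(&a, 16);
	p3 = arena_alloc(&a, 64);
	return (p1 != NULL && p2 != NULL && p3 == NULL &&
	    ((uintptr_t)p1 % ARENA_ALIGN) == 0 &&
	    ((uintptr_t)p2 % ARENA_ALIGN) == 0);
}
```

The trade-off in this design is that the pool must be sized for the
worst case up front, which is why knowing the bounds of the allocations
matters before relying on it.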


-Alfred


Re: removing plip from GENERIC

2013-01-30 Thread Alfred Perlstein
Does plip no longer work?

Sent from my iPhone

On Jan 30, 2013, at 4:39 PM, Eitan Adler li...@eitanadler.com wrote:

 There has been some discussion about removing plip support from GENERIC kernels.
 plip still appears in sys/conf/NOTES
 
 Does anyone object to the following?
 
 commit f4efd3cf43514bcb1378e2c5e8879a411b943be2
 Author: Eitan Adler li...@eitanadler.com
 Date:   Mon Jan 28 15:13:57 2013 -0500
 
     Remove support for plip from the GENERIC kernel as no systems in the
     last 10 years require this support.
 
     Discussed with: db
     Discussed with: imp
     Reviewed by:    -hackers
     Approved by:    ??? (mentor)
 
 diff --git a/sys/amd64/conf/GENERIC b/sys/amd64/conf/GENERIC
 index e53f692..5819a0d 100644
 --- a/sys/amd64/conf/GENERIC
 +++ b/sys/amd64/conf/GENERIC
 @@ -197,7 +197,6 @@ device          uart            # Generic UART driver
  device          ppc
  device          ppbus           # Parallel port bus (required)
  device          lpt             # Printer
 -device          plip            # TCP/IP over parallel
  device          ppi             # Parallel port interface device
  #device         vpo             # Requires scbus and da
 
 diff --git a/sys/i386/conf/GENERIC b/sys/i386/conf/GENERIC
 index 819379e..47af43b 100644
 --- a/sys/i386/conf/GENERIC
 +++ b/sys/i386/conf/GENERIC
 @@ -208,7 +208,6 @@ device          uart            # Generic UART driver
  device          ppc
  device          ppbus           # Parallel port bus (required)
  device          lpt             # Printer
 -device          plip            # TCP/IP over parallel
  device          ppi             # Parallel port interface device
  #device         vpo             # Requires scbus and da
 
 diff --git a/sys/pc98/conf/GENERIC b/sys/pc98/conf/GENERIC
 index 2b048a9..eda1d14 100644
 --- a/sys/pc98/conf/GENERIC
 +++ b/sys/pc98/conf/GENERIC
 @@ -151,7 +151,6 @@ device          mse
  device          ppc
  device          ppbus           # Parallel port bus (required)
  device          lpt             # Printer
 -device          plip            # TCP/IP over parallel
  device          ppi             # Parallel port interface device
  #device         vpo             # Requires scbus and da
  # OLD Parallel port
 diff --git a/sys/sparc64/conf/GENERIC b/sys/sparc64/conf/GENERIC
 index f9d3b93..79124ab 100644
 --- a/sys/sparc64/conf/GENERIC
 +++ b/sys/sparc64/conf/GENERIC
 @@ -161,7 +161,6 @@ device          uart            # Multi-uart driver
  #device         ppc
  #device         ppbus           # Parallel port bus (required)
  #device         lpt             # Printer
 -#device         plip            # TCP/IP over parallel
  #device         ppi             # Parallel port interface device
  #device         vpo             # Requires scbus and da
 
 
 
 -- 
 Eitan Adler


Re: Sockets programming question

2013-01-28 Thread Alfred Perlstein

On 1/28/13 10:11 AM, Ian Lepore wrote:

I've got a question that isn't exactly freebsd-specific, but
implementation-specific behavior may be involved.

I've got a server process that accepts connections from clients on a
PF_LOCAL stream socket.  Multiple clients can be connected at once; a
list of them is tracked internally.  The server occasionally sends data
to each client.  The time between messages to clients can range
literally from milliseconds to months.  Clients never send any data to
the server, indeed they may shutdown that side of the connection and
just receive data.

The only way I can find to discover that a client has disappeared is by
trying to send them a message and getting an error because they've
closed the socket or died completely.  At that point I can reap the
resources and remove them from the client list.  This is a problem because
of the months between messages thing.  A lot of clients can come and
go during those months and I've got this ever-growing list of open
socket descriptors because I never had anything to say the whole time
they were connected.

By trial and error I've discovered that I can sort of poll for their
presence by writing a zero-length message.  If the other end of the
connection is gone I get the expected error and can reap the client,
otherwise it appears to quietly write nothing and return zero and have
no other side effects than polling the status of the server-client side
of the pipe.

My problem with this polling is that I can't find anything in writing
that sanctions this behavior.  Would this amount to relying on a
non-portable accident of the current implementation?

Also, am I missing something simple and there's a canonical way to
handle this?  In all the years I've done client/server stuff I've never
had quite this type of interaction (or lack thereof) between client and
server before.

I may be mistaken, but doesn't poll(2) allow you to see this as well?

I think you should see POLLHUP in revents if POLLOUT is set in events.

I think there also is an analogous way to do this with kevent as well.
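
A minimal sketch of the poll(2) approach, using a socketpair to stand in
for the server/client connection. The function names are hypothetical,
and the exact revents bits reported after a peer closes are an
assumption that should be checked on the target system; the sketch
treats POLLHUP, POLLERR and POLLNVAL all as "peer gone".

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the peer of fd appears to be gone, 0 if the connection
 * still looks alive, -1 if poll(2) itself failed.  Nothing is written
 * to the socket; a zero timeout makes this a pure status check.
 */
int
peer_gone(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };

	if (poll(&pfd, 1, 0) < 0)
		return (-1);
	return ((pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) != 0);
}

/* Checks a local stream socket before and after its peer closes. */
int
demo_peer_gone(void)
{
	int sv[2], before, after;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return (-1);
	before = peer_gone(sv[0]);	/* peer still open */
	close(sv[1]);			/* client goes away */
	after = peer_gone(sv[0]);
	close(sv[0]);
	return (before == 0 && after == 1);
}
```

A server could run peer_gone() over its client list on a timer, or put
every client descriptor into a single poll() call, and reap the entries
that report a hangup instead of waiting for the next real message.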

-Alfred


Re: off topic but no idea where to ask

2013-01-15 Thread Alfred Perlstein

On 1/15/13 10:18 AM, Wojciech Puchar wrote:
does anyone know a PXE image (just like /boot/pxeboot) that can be 
placed on tftp server and the only thing it will do would be loading 
first sector from first local disk at 0x07c00 and booting as with 
normal hard drive.


what i need is to be able to decide from server side if given computer 
boots from NFS or hard disk.


This may not be helpful, but maybe you can have the server just deny the 
PXE client the information needed to boot and set the BIOS boot order to:


1st: PXE
2nd: local disk

?

This way, if the server doesn't respond to the PXE request, the client 
will then try the local disk?


-Alfred


Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Alfred Perlstein

On 1/15/13 1:43 PM, Konstantin Belousov wrote:

On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:


 Luckily it's for an open source project (Python), so recompilation
 isn't a big deal.  (I also check the intrinsic result versus the
 syscall result during startup to verify the same ID is returned,
 falling back to the syscall by default.)

For you, maybe. For your users, it definitely will be a problem.
And worse, the problem will be blamed on the operating system and not
on the broken application.


Anything we can do to avoid this would be best.

The reason is that we are still dealing with an optimization that perl 
did: it reached inside the opaque struct FILE to do nasty things.  
Now it is very difficult for us to fix struct FILE.


We are still paying for this years later.

Any way we can make this a supported interface?

-Alfred




Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-10 Thread Alfred Perlstein

On 1/10/13 2:38 AM, Konstantin Belousov wrote:

On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:

Here are more convenient links that give diffs against FreeBSD and
jemalloc for the proposed changes:

FreeBSD:
https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2


Why  do you need to expedite the records through the ktrace at all ?
Wouldn't direct write(2)s to a file allow for better performance
due to not stressing kernel memory allocator and single writing thread ?
Also, the malloc coupling to the single-system interface would be
prevented.

I believe that other usermode tracers also behave in the similar way,
using writes and not private kernel interface.

Also, what /proc issues did you mention? There is
sysctl kern.proc.vmmap which is much more convenient than /proc/pid/map
and does not require /proc mounted.


jemalloc:
https://github.com/alfredperlstein/jemalloc/compare/master...utrace2



Konstantin, you are right, it is a strange thing this utrace.  I am not 
sure why it was done this way.


You are correct in that a much more efficient system could be made using 
writes gathered into a single write(2).


Do you think there is any reason they may have re-used the kernel paths 
for ktrace even at the cost of efficiency?


About kern.proc.vmmap I will look into that.

-Alfred


Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-10 Thread Alfred Perlstein

On 1/10/13 1:05 PM, Konstantin Belousov wrote:

On Thu, Jan 10, 2013 at 10:16:46AM -0500, Alfred Perlstein wrote:

On 1/10/13 2:38 AM, Konstantin Belousov wrote:

On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:

Here are more convenient links that give diffs against FreeBSD and
jemalloc for the proposed changes:

FreeBSD:
https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2


Why  do you need to expedite the records through the ktrace at all ?
Wouldn't direct write(2)s to a file allow for better performance
due to not stressing kernel memory allocator and single writing thread ?
Also, the malloc coupling to the single-system interface would be
prevented.

I believe that other usermode tracers also behave in the similar way,
using writes and not private kernel interface.

Also, what /proc issues did you mention? There is
sysctl kern.proc.vmmap which is much more convenient than /proc/pid/map
and does not require /proc mounted.


jemalloc:
https://github.com/alfredperlstein/jemalloc/compare/master...utrace2


Konstantin, you are right, it is a strange thing this utrace.  I am not
sure why it was done this way.

You are correct in that a much more efficient system could be made using
writes gathered into a single write(2).

Even without writes gathering, non-coalesced writes should be faster than
utrace.


Do you think there is any reason they may have re-used the kernel paths
for ktrace even at the cost of efficiency?

I can only speculate. The utracing of the malloc calls in the context
of the ktrace stream is useful for the human reading the trace. Instead
of seeing the sequence of unexplainable calls allocating and freeing
memory, you would see something more penetrable. For example, you would
see accept/malloc/read/write/free, which could be usefully interpreted
as network server serving the client.

This context is not needed for a leak detector.
Now I may be wrong here, but I think it's an artifact of someone 
noticing how useful it would be to fit this into the ktrace system and 
leverage existing code.


Even though there are significant performance deficiencies, the actual 
utility of the existing framework may have been such a stepping stone 
towards tracing that it was just used.


Right now the code already exists, however it logs just {operation, 
size, ptr}, example:

malloc, 512, -> 0xdeadbeef
free, 0, 0xdeadbeef
realloc, 512, 0 -> 0xdeadc0de
realloc, 1024, 0xdeadc0de -> 0x
free, 0, 0x

What do you think of just adding the address of the caller of 
malloc/free/realloc to these already existing tracepoints?


-Alfred



Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-09 Thread Alfred Perlstein

On 12/23/12 12:28 PM, Jason Evans wrote:

On Dec 21, 2012, at 7:37 PM, Alfred Perlstein bri...@mu.org wrote:

So the other day in an effort to debug a memory leak I decided to take a look 
at malloc+utrace(2) and decided to make a tool to debug where leaks are coming 
from.

A few hours later I have:
1) a new version of utrace(2) (utrace2(2)) that uses structured data to prevent 
overloading of data.   (utrace2.diff)
2) changes to ktrace and kdump to decode the new format. (also in utrace2.diff)
3) changes to jemalloc to include the new format AND the function caller so 
it's easy to get the source of the leaks. (also in utrace2.diff)
4) a program that can take a pipe of kdump(1) and figure out what memory has 
leaked. (alloctrace.py)
5) simple test program (test_utrace.c)

[…]

Have you looked at the heap profiling functionality built into jemalloc?  It's not 
currently enabled on FreeBSD, but as far as I know, the only issue keeping it from 
being useful is the absence of a Linux-compatible /proc/pid/maps (and the 
gperftools folks may already have a solution for that; I haven't looked).  I think it 
makes more sense to get that sorted out than to develop a separate trace-based leak 
checker.  The problem with tracing is that it doesn't scale beyond some relatively 
small number of allocator events.


I have looked at some of this functionality (heap profiling) but alas it 
is not implemented yet.  In addition, the dtrace work appears to be quite 
far from a workable solution, with too many performance penalties until 
some serious hacking is done.


I am just not sure how to proceed: I do not really have the 
skill to fix the /proc/pid/maps problem, nor to figure out how to get 
dtrace into the system in any reasonable time frame.


All a few of us need is the addition of the trace back into the existing 
utrace framework.



Is it time to start installing with some form of debug symbols? This would help 
us also with dtrace.

Re: debug symbols, frame pointers, etc. necessary to make userland dtrace work 
by default, IMO we should strongly prefer such defaults.  It's more reasonable 
to expect people who need every last bit of performance to remove functionality 
than to expect people who want to figure out what the system is doing to figure 
out what functionality to turn on.



This is very true.  I'm going to continue to work towards this end with 
a few people and get up to speed on it so that we can hopefully get to 
this point in the next release cycle or two.


If you have a few moments, can you have a look at the utrace2 branches 
here:

https://github.com/alfredperlstein/freebsd/tree/utrace2

This branch contains the addition of the utrace2 system call, which is 
needed to structure the data passed via utrace(2).  The point of this is to 
save kdump(1) from having to discern the type of ktrace records based on 
arbitrary size or other parameters, and to introduce an extensible protocol 
for new types of utrace data.


The utrace2 branch here augments jemalloc to use utrace2 to pass the old 
utrace records and, in addition, the return address along with 
the type and size of the allocation:

https://github.com/alfredperlstein/jemalloc/tree/utrace2

-Alfred


Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-09 Thread Alfred Perlstein

On 1/10/13 1:41 AM, Alfred Perlstein wrote:

On 12/23/12 12:28 PM, Jason Evans wrote:

On Dec 21, 2012, at 7:37 PM, Alfred Perlstein bri...@mu.org wrote:
So the other day in an effort to debug a memory leak I decided to 
take a look at malloc+utrace(2) and decided to make a tool to debug 
where leaks are coming from.


A few hours later I have:
1) a new version of utrace(2) (utrace2(2)) that uses structured data 
to prevent overloading of data. (utrace2.diff)
2) changes to ktrace and kdump to decode the new format. (also in 
utrace2.diff)
3) changes to jemalloc to include the new format AND the function 
caller so it's easy to get the source of the leaks. (also in 
utrace2.diff)
4) a program that can take a pipe of kdump(1) and figure out what 
memory has leaked. (alloctrace.py)

5) simple test program (test_utrace.c)

[…]
Have you looked at the heap profiling functionality built into 
jemalloc?  It's not currently enabled on FreeBSD, but as far as I 
know, the only issue keeping it from being useful is the absence of a 
Linux-compatible /proc/pid/maps (and the gperftools folks may 
already have a solution for that; I haven't looked).  I think it 
makes more sense to get that sorted out than to develop a separate 
trace-based leak checker.  The problem with tracing is that it 
doesn't scale beyond some relatively small number of allocator events.


I have looked at some of this functionality (heap profiling) but alas 
it is not implemented yet.  In addition, the dtrace work appears to be 
quite far from a workable solution, with too many performance 
penalties until some serious hacking is done.


I am just not sure how to proceed: I do not really have 
the skill to fix the /proc/pid/maps problem, nor to figure out how to get 
dtrace into the system in any reasonable time frame.


All a few of us need is the addition of the trace back into the 
existing utrace framework.


Is it time to start installing with some form of debug symbols? This 
would help us also with dtrace.
Re: debug symbols, frame pointers, etc. necessary to make userland 
dtrace work by default, IMO we should strongly prefer such defaults.  
It's more reasonable to expect people who need every last bit of 
performance to remove functionality than to expect people who want to 
figure out what the system is doing to figure out what functionality 
to turn on.




This is very true.  I'm going to continue to work towards this end 
with a few people and get up to speed on it so that we can hopefully 
get to this point in the next release cycle or two.


If you have a few moments, can you have a look at the utrace2 
branches here:

https://github.com/alfredperlstein/freebsd/tree/utrace2

This branch contains the addition of the utrace2 system call which is 
needed to structure data via utrace(2).  The point of this is to avoid 
kdump(1) needing to discern type of ktrace records based on arbitrary 
size or other parameters and introduces an extensible protocol for new 
types of utrace data.


The utrace2 branch here augments jemalloc to use utrace2 to pass the 
old utrace records, but in addition to pass the return address along 
with the type and size of the allocation:

https://github.com/alfredperlstein/jemalloc/tree/utrace2

-Alfred


Jason,

Here are more convenient links that give diffs against FreeBSD and 
jemalloc for the proposed changes:


FreeBSD:
https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2

jemalloc:
https://github.com/alfredperlstein/jemalloc/compare/master...utrace2

-Alfred



Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-24 Thread Alfred Perlstein

On 12/23/12 8:42 PM, Mark Johnston wrote:

I have an extension of Ed's patch which handles bsd.prog.mk at
http://people.freebsd.org/~markj/patches/debug_symbols/debug_symbols_full.patch


I am probably doing something wrong, but I am still getting that build 
error:



BFD: __vdso_gettimeofday.So: invalid SHT_GROUP entry
BFD: __vdso_gettimeofday.So: invalid SHT_GROUP entry
BFD: __vdso_gettimeofday.So: no group info for section 
.text.__vdso_gettimeofday
BFD: __vdso_gettimeofday.So: no group info for section 
.text.__vdso_clock_gettime



with the following flags set:


~/freebsd % cat /etc/make.conf
KERNCONF?=VBOX_NOWITNESS
# added by use.perl 2012-11-05 11:50:32
PERL_VERSION=5.14.2
.(02:35:34)(alfred@spigot)
~/freebsd % cat /etc/src.conf
WITH_CTF=1
WITH_DEBUG_FILES=1
.(02:35:37)(alfred@spigot)

/usr/home/alfred/freebsd # printenv
_=/usr/bin/printenv
SCRIPT=typescript
COVERITY_WORKDIR=/root/coverity-output
COVERITY_DATADIR=/root/coverity-database
COVERITY_UNSUPPORTED=1
PAGER=more
PKG_CONFIG=/p/lib/pkgconfig
ZPROFILE_SOURCED=yes
CVS_RSH=ssh
EDITOR=vi
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/root/bin
FTP_PASSIVE_MODE=yes
COWPATH=cows
MAILDIR=/root/Maildir
IRCNAME=Alfred Perlstein alfred at freebsd.org
CVSROOT=/home/ncvs
BLOCKSIZE=1K
OLDPWD=/usr/home/alfred/freebsd
PWD=/usr/home/alfred/freebsd
SHLVL=2
TERM=screen
SHELL=/usr/local/bin/zsh
MAIL=/var/mail/root
LOGNAME=root
USER=root
USERNAME=root
HOME=/root
SUDO_COMMAND=/usr/local/bin/zsh
SUDO_USER=alfred
SUDO_UID=501
SUDO_GID=501


http://people.freebsd.org/~alfred/dtrace/markjdb.build.xz



Thanks!
-Garrett

PS Awesome solution for the ports issue :).


Re: malloc+utrace, tracking memory leaks in a running program.

2012-12-23 Thread Alfred Perlstein

On 12/23/12 9:28 AM, Jason Evans wrote:

On Dec 21, 2012, at 7:37 PM, Alfred Perlstein bri...@mu.org wrote:

So the other day in an effort to debug a memory leak I decided to take a look 
at malloc+utrace(2) and decided to make a tool to debug where leaks are coming 
from.

A few hours later I have:
1) a new version of utrace(2) (utrace2(2)) that uses structured data to prevent 
overloading of data.   (utrace2.diff)
2) changes to ktrace and kdump to decode the new format. (also in utrace2.diff)
3) changes to jemalloc to include the new format AND the function caller so 
it's easy to get the source of the leaks. (also in utrace2.diff)
4) a program that can take a pipe of kdump(1) and figure out what memory has 
leaked. (alloctrace.py)
5) simple test program (test_utrace.c)

[…]

Have you looked at the heap profiling functionality built into jemalloc?  It's not 
currently enabled on FreeBSD, but as far as I know, the only issue keeping it from 
being useful is the absence of a Linux-compatible /proc/pid/maps (and the 
gperftools folks may already have a solution for that; I haven't looked).  I think it 
makes more sense to get that sorted out than to develop a separate trace-based leak 
checker.  The problem with tracing is that it doesn't scale beyond some relatively 
small number of allocator events.

Ok, we are in agreement on this all.

Paul Saab recommended profiling to me, but yes, the problem is that none 
of this stuff works on FreeBSD out of the box due to missing bits here 
or there.  Augmenting the existing utrace stuff to get what I needed 
seemed much simpler than figuring out how to get dtrace, pidmaps and 
whatnot into the system.  It's a matter of requirements: accomplishing 
these higher-order things requires 
X=(skill+time+ability_to_socialize_these_changes), where 
X > alfred->skill_and_time_and_socialize().  :)


To be honest, if dtrace just worked, then I could get the same 
information I'm getting from utrace2(2) from dtrace with no problem.  
(at least I think so).


As far as scaling goes, I agree it does not work for long-running 
programs; however, there are a few instances of programs leaking large 
amounts of memory in a short time that I can track down by temporarily 
ktracing them for a short while.



Is it time to start installing with some form of debug symbols? This would help 
us also with dtrace.

Re: debug symbols, frame pointers, etc. necessary to make userland dtrace work 
by default, IMO we should strongly prefer such defaults.  It's more reasonable 
to expect people who need every last bit of performance to remove functionality 
than to expect people who want to figure out what the system is doing to figure 
out what functionality to turn on.

Yes!!! :)

Is there an easy way to go about this?

Rui says it's really a matter of just turning off stripping of shlibs 
and adding -fno-omit-frame-pointer and WITH_CTF.


I'm going to give this a shot, if it works, can you help me refine this?

I'll post diffs later today if I don't get completely stuck somehow.

-Alfred


Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-23 Thread Alfred Perlstein

On 12/23/12 1:47 PM, Garrett Cooper wrote:

On Sun, Dec 23, 2012 at 8:26 AM, Ed Maste ema...@freebsd.org wrote:

On 22 December 2012 23:13, Alfred Perlstein bri...@mu.org wrote:

I have a patch for this.  I am building world to see what happens, if you
want to try it, or comment on it, please let me know.

Changes are:
   base DEBUGDIR on LIBDIR for ports
   create intermediate directories for debug objs.

Note that just moving ports debug data to /usr/local/lib/debug/...
won't work since GDB won't search there.  We could teach it to search
a list of paths and include /usr/local/lib/debug and /usr/lib/debug,
or perhaps a symlink under /usr/local/lib.

We could also use a .debug subdirectory for ports and other users of
bsd.lib.mk - so for example /usr/local/lib/libfoo.so would have debug
info in /usr/local/lib/.debug/libfoo.so.debug.

 Crazy idea: why not just provide the user with an example .gdbinit
that does these things? For Isilon it makes more sense to tack on
additional paths (which we already do in our internal directions), and
others potentially are doing similar *shrugs*, so as long as the
example makes sense, I'd stick with it.
 I would probably setup things in such a way that the old default
is kept though because I'm sure that there's someone out there that's
using it (even if it's not *the best* default per how we prefix things
in ports).

It's not a crazy idea, it's pretty good; however, what it really turns 
into is another thing that doesn't work easily out of the box.


Stuff like this should just work.  And if it doesn't, then we need to 
think harder about it.


-Alfred


Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-23 Thread Alfred Perlstein

On 12/23/12 7:20 PM, Garrett Cooper wrote:

On Sat, Dec 22, 2012 at 9:42 PM, Alfred Perlstein bri...@mu.org wrote:

...
lfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So:
invalid SHT_GROUP entry
/usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So: no
group info for section .text.__vdso_gettimeofday
/usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So: no
group info for section .text.__vdso_clock_gettime
/usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So:
unknown [0] section `' in group [__vdso_gettimeofday]
/usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So:
unknown [0] section `' in group [__vdso_clock_gettime]
__vdso_gettimeofday.So: file not recognized: File format not recognized
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libc.so.7.full] Error code 1
1 error
*** [lib/libc__L] Error code 2
1 error
*** [libraries] Error code 2
1 error
*** [_libraries] Error code 2
1 error
*** [buildworld] Error code 2
1 error
 Is RANLIB not being called properly (for lack of a better example:
https://forums.oracle.com/forums/thread.jspa?messageID=8438118 )?
Cheers,
-Garrett

I had WITH_CTF=yes in my /etc/make.conf so this appears to be fallout 
of WITH_CTF=yes.  Basically this seems incompatible with dtrace unless I 
figure it out further.


-Alfred


Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-23 Thread Alfred Perlstein

On 12/23/12 8:36 PM, Mark Johnston wrote:



On Dec 23, 2012 10:55 PM, Alfred Perlstein bri...@mu.org wrote:


 On 12/23/12 7:20 PM, Garrett Cooper wrote:

 On Sat, Dec 22, 2012 at 9:42 PM, Alfred Perlstein bri...@mu.org wrote:


 ...
 lfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So:
 invalid SHT_GROUP entry
 /usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So: no
 group info for section .text.__vdso_gettimeofday
 /usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So: no
 group info for section .text.__vdso_clock_gettime
 /usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So:
 unknown [0] section `' in group [__vdso_gettimeofday]
 /usr/obj/usr/home/alfred/freebsd/tmp/usr/bin/ld: __vdso_gettimeofday.So:
 unknown [0] section `' in group [__vdso_clock_gettime]
 __vdso_gettimeofday.So: file not recognized: File format not recognized
 cc: error: linker command failed with exit code 1 (use -v to see invocation)
 *** [libc.so.7.full] Error code 1
 1 error
 *** [lib/libc__L] Error code 2
 1 error
 *** [libraries] Error code 2
 1 error
 *** [_libraries] Error code 2
 1 error
 *** [buildworld] Error code 2
 1 error
  Is RANLIB not being called properly (for lack of a better example:
 https://forums.oracle.com/forums/thread.jspa?messageID=8438118 )?
 Cheers,
 -Garrett

 I had WITH_CTF=yes in my /etc/make.conf so this appears to be 
fallout of WITH_CTF=yes.  Basically this seems incompatible with 
dtrace unless I figure it out further.


I have WITH_CTF=YES in src.conf and I've been able to complete a
buildworld+installworld with an extension of this patch. The errors
are also occurring before objcopy has done anything, so I don't see
why they might be related.


Care to post the patch?  I rm -rf'd all of /usr/obj and tried twice 
before removing WITH_CTF to get through it.


-Alfred


Re: malloc+utrace, tracking memory leaks in a running program.

2012-12-22 Thread Alfred Perlstein

On 12/22/12 8:56 AM, Ed Maste wrote:

On 21 December 2012 22:37, Alfred Perlstein bri...@mu.org wrote:


Is it time to start installing with some form of debug symbols? This would
help us also with dtrace.

I just posted a patch to add a knob to build and install standalone
debug files.  My intent is that we will build releases with this
enabled, and add a base-dbg.txz distribution that contains the debug
data for the base system, so that one can install it along with
everything else, or add it later on when needed.

We could perhaps teach dtrace to read its data from standalone .ctf
files, or have it read DWARF directly and use the same debug files.


Thank you.

I've added Rui Paulo to the CC.  Rui, do you think it's easy to get dtrace 
to honor these conventions?


-Alfred


Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-22 Thread Alfred Perlstein

On 12/22/12 8:46 AM, Ed Maste wrote:

When this knob is set standalone debug files for shared objects are
built and installed in /usr/lib/debug/<so pathname>.debug.  GDB
searches this path for debug data.

The -g flag is automatically added to CFLAGS if debug files are enabled
(but the shared objects are still installed stripped, if DEBUG_FLAGS is
not set).
---
This is a refinement of my earlier change for shared object standalone
debug.  This patch also includes the following changes:

- Change GDB's standalone debug file path to the default /usr/lib/debug.

- Change debug file extension from 'symbols' to 'debug', in line with
   GDB's documentation.  I initially followed the kernel build example
   in choosing .symbols, but .debug more accurately represents the use
   of these files.

This looks promising.  After this patch, running gdb 
./a.linked.with.libc.so.out should work without any extra thinking?


I'll give it a shot.

We should enable this flag by default.  A big FreeBSD strength is our 
debugging system.


-Alfred


Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-22 Thread Alfred Perlstein

On 12/22/12 6:14 PM, Jan Beich wrote:

Ed Maste ema...@freebsd.org writes:


When this knob is set standalone debug files for shared objects are
built and installed in /usr/lib/debug/<so pathname>.debug.  GDB
searches this path for debug data.

[...]
What about ports? They are not allowed to install outside of PREFIX.

   $ cd multimedia/cuse4bsd-kmod
   $ make install PREFIX=/tmp/aaa PKG_DBDIR=/tmp/pkg WITH_DEBUG=
   [...]
   install -C -o root -g wheel -m 444   libcuse4bsd.a /tmp/aaa/lib
   install -s -o root -g wheel -m 444 libcuse4bsd.so.1 /tmp/aaa/lib
   install -o root -g wheel -m 444 libcuse4bsd.so.1.debug /usr/lib/debug/tmp/aaa/lib
   install: /usr/lib/debug/tmp/aaa/lib: No such file or directory
   *** [_libinstall] Error code 71
I have a patch for this.  I am building world to see what happens, if 
you want to try it, or comment on it, please let me know.


Changes are:
  base DEBUGDIR on LIBDIR for ports
  create intermediate directories for debug objs.

-Alfred
diff --git a/etc/mtree/BSD.usr.dist b/etc/mtree/BSD.usr.dist
index 336d055..7b428d5 100644
--- a/etc/mtree/BSD.usr.dist
+++ b/etc/mtree/BSD.usr.dist
@@ -18,6 +18,22 @@
 aout
 ..
 ..
+debug
+boot
+..
+lib
+geom
+..
+..
+usr
+lib
+engines
+..
+..
+lib32
+..
+..
+..
 dtrace
 ..
 engines
diff --git a/gnu/usr.bin/gdb/arch/amd64/config.h b/gnu/usr.bin/gdb/arch/amd64/config.h
index ac81c54..ae3e104 100644
--- a/gnu/usr.bin/gdb/arch/amd64/config.h
+++ b/gnu/usr.bin/gdb/arch/amd64/config.h
@@ -440,7 +440,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_i386_arch
diff --git a/gnu/usr.bin/gdb/arch/arm/config.h b/gnu/usr.bin/gdb/arch/arm/config.h
index e1b128c..26b1891 100644
--- a/gnu/usr.bin/gdb/arch/arm/config.h
+++ b/gnu/usr.bin/gdb/arch/arm/config.h
@@ -452,7 +452,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_arm_arch
diff --git a/gnu/usr.bin/gdb/arch/i386/config.h b/gnu/usr.bin/gdb/arch/i386/config.h
index f21da4c..30e75c3 100644
--- a/gnu/usr.bin/gdb/arch/i386/config.h
+++ b/gnu/usr.bin/gdb/arch/i386/config.h
@@ -440,7 +440,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_i386_arch
diff --git a/gnu/usr.bin/gdb/arch/ia64/config.h b/gnu/usr.bin/gdb/arch/ia64/config.h
index 5faa96b..f8c84ab 100644
--- a/gnu/usr.bin/gdb/arch/ia64/config.h
+++ b/gnu/usr.bin/gdb/arch/ia64/config.h
@@ -440,7 +440,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_ia64_arch
diff --git a/gnu/usr.bin/gdb/arch/mips/config.h b/gnu/usr.bin/gdb/arch/mips/config.h
index 41a6731..01a7869 100644
--- a/gnu/usr.bin/gdb/arch/mips/config.h
+++ b/gnu/usr.bin/gdb/arch/mips/config.h
@@ -440,7 +440,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_mips_arch
diff --git a/gnu/usr.bin/gdb/arch/powerpc/config.h b/gnu/usr.bin/gdb/arch/powerpc/config.h
index f169fad..49472e7 100644
--- a/gnu/usr.bin/gdb/arch/powerpc/config.h
+++ b/gnu/usr.bin/gdb/arch/powerpc/config.h
@@ -440,7 +440,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_rs6000_arch
diff --git a/gnu/usr.bin/gdb/arch/powerpc64/config.h b/gnu/usr.bin/gdb/arch/powerpc64/config.h
index d8b9b6d..c904d1d 100644
--- a/gnu/usr.bin/gdb/arch/powerpc64/config.h
+++ b/gnu/usr.bin/gdb/arch/powerpc64/config.h
@@ -440,7 +440,7 @@
 #define PACKAGE gdb
 
 /* Global directory for separate debug files.  */
-#define DEBUGDIR /usr/local/lib/debug
+#define DEBUGDIR /usr/lib/debug
 
 /* Define to BFD's default architecture.  */
 #define DEFAULT_BFD_ARCH bfd_rs6000_arch
diff --git a/gnu/usr.bin/gdb/arch/sparc64/config.h b/gnu/usr.bin/gdb/arch/sparc64/config.h
index 5527a79..ff87c28 100644
--- a/gnu/usr.bin/gdb/arch/sparc64/config.h
+++ b/gnu/usr.bin/gdb/arch/sparc64/config.h
@@ -440,7 +440,7 @@
 

Re: [PATCH] Add WITH_DEBUG_FILES knob to enable separate debug files

2012-12-22 Thread Alfred Perlstein

On 12/22/12 8:13 PM, Alfred Perlstein wrote:

On 12/22/12 6:14 PM, Jan Beich wrote:

Ed Maste ema...@freebsd.org writes:


When this knob is set, standalone debug files for shared objects are
built and installed in /usr/lib/debug/<so pathname>.debug.  GDB
searches this path for debug data.

[...]
What about ports? They are not allowed to install outside of PREFIX.

   $ cd multimedia/cuse4bsd-kmod
   $ make install PREFIX=/tmp/aaa PKG_DBDIR=/tmp/pkg WITH_DEBUG=
   [...]
   install -C -o root -g wheel -m 444   libcuse4bsd.a /tmp/aaa/lib
   install -s -o root -g wheel -m 444 libcuse4bsd.so.1 /tmp/aaa/lib
   install -o root -g wheel -m 444 libcuse4bsd.so.1.debug /usr/lib/debug/tmp/aaa/lib

   install: /usr/lib/debug/tmp/aaa/lib: No such file or directory
   *** [_libinstall] Error code 71
I have a patch for this.  I am building world to see what happens; if 
you want to try it or comment on it, please let me know.


Changes are:
  base DEBUGDIR on LIBDIR for ports
  create intermediate directories for debug objs.

-Alfred
Buildworld breaks for me with my patch applied; I have no idea what this 
all means yet.  Will look into it:


exer.c -o nslexer.So
ctfconvert -L VERSION crypt_xdr.o
ctfconvert -L VERSION crypt_clnt.o
ctfconvert -L VERSION nsparser.So
ctfconvert -L VERSION nsparser.o
ctfconvert -L VERSION crypt_clnt.So
ctfconvert -L VERSION crypt_xdr.So
ctfconvert -L VERSION yp_xdr.o
ctfconvert -L VERSION subr_acl_nfs4.So
ctfconvert -L VERSION nslexer.o
ctfconvert -L VERSION yp_xdr.So
ctfconvert -L VERSION subr_acl_nfs4.o
building static c library
ctfconvert -L VERSION nslexer.So
building shared library libc.so.7
building special pic c library
BFD: dlfcn.o: invalid SHT_GROUP entry
BFD: dlfcn.o: invalid SHT_GROUP entry
BFD: dlfcn.o: invalid SHT_GROUP entry
BFD: dlfcn.o: invalid SHT_GROUP entry
BFD: dlfcn.o: no group info for section .text._rtld_error
BFD: dlfcn.o: no group info for section .text.dladdr
BFD: dlfcn.o: no group info for section .text.dlclose
BFD: dlfcn.o: no group info for section .text.dlerror
BFD: dlfcn.o: no group info for section .text.dllockinit
BFD: dlfcn.o: no group info for section .text.dlopen
BFD: dlfcn.o: no group info for section .text.dlsym
BFD: dlfcn.o: no group info for section .text.dlvsym
BFD: dlfcn.o: no group info for section .text.dlinfo
BFD: dlfcn.o: no group info for section .text._rtld_thread_init
BFD: dlfcn.o: no group info for section .text._rtld_atfork_post
BFD: dlfcn.o: no group info for section .text._rtld_get_stack_prot
nm: dlfcn.o: File format not recognized
BFD: elf_utils.o: invalid SHT_GROUP entry
BFD: elf_utils.o: no group info for section .text.__pthread_map_stacks_exec
BFD: elf_utils.o: unknown [0] section `' in group [__pthread_map_stacks_exec]
nm: elf_utils.o: File format not recognized
ranlib libc_pic.a
BFD: __vdso_gettc.o: invalid SHT_GROUP entry
BFD: __vdso_gettc.o: no group info for section .text.__vdso_gettc
BFD: __vdso_gettc.o: unknown [0] section `' in group [__vdso_gettc]
nm: __vdso_gettc.o: File format not recognized
BFD: __vdso_gettimeofday.o: invalid SHT_GROUP entry
BFD: __vdso_gettimeofday.o: invalid SHT_GROUP entry
BFD: __vdso_gettimeofday.o: no group info for section .text.__vdso_gettimeofday
BFD: __vdso_gettimeofday.o: no group info for section .text.__vdso_clock_gettime
BFD: __vdso_gettimeofday.o: unknown [0] section `' in group [__vdso_gettimeofday]
BFD: __vdso_gettimeofday.o: unknown [0] section `' in group [__vdso_clock_gettime]
nm: __vdso_gettimeofday.o: File format not recognized
BFD: dlfcn.So: invalid SHT_GROUP entry
BFD: dlfcn.So: invalid SHT_GROUP entry
BFD: dlfcn.So: invalid SHT_GROUP entry
BFD: dlfcn.So: invalid SHT_GROUP entry
BFD: dlfcn.So: no group info for section .text._rtld_error
BFD: dlfcn.So: no group info for section .text.dladdr
BFD: dlfcn.So: no group info for section .text.dlclose
BFD: dlfcn.So: no group info for section .text.dlerror
BFD: dlfcn.So: no group info for section .text.dllockinit
BFD: dlfcn.So: no group info for section .text.dlopen
BFD: dlfcn.So: no group info for section .text.dlsym
BFD: dlfcn.So: no group info for section .text.dlvsym
BFD: dlfcn.So: no group info for section .text.dlinfo
BFD: dlfcn.So: no group info for section .text._rtld_thread_init
BFD: dlfcn.So: no group info for section .text._rtld_atfork_post
BFD: dlfcn.So: no group info for section .text._rtld_get_stack_prot
BFD: dlfcn.So: unknown [2] section `.symtab' in group [dl_iterate_phdr]
BFD: dlfcn.So: unknown [0] section `' in group [_rtld_atfork_pre]
BFD: dlfcn.So: unknown [0] section `' in group [_rtld_atfork_post]
BFD: dlfcn.So: unknown [0] section `' in group [_rtld_addr_phdr]
BFD: dlfcn.So: unknown [0] section `' in group [_rtld_get_stack_prot]
nm: dlfcn.So: File format not recognized
BFD: elf_utils.So: invalid SHT_GROUP entry
BFD: elf_utils.So: no group info for section .text.__pthread_map_stacks_exec
BFD: elf_utils.So: unknown [0] section `' in group [__pthread_map_stacks_exec]
nm: elf_utils.So: File format

malloc+utrace, tracking memory leaks in a running program.

2012-12-21 Thread Alfred Perlstein

Hey guys.

So the other day, in an effort to debug a memory leak, I took a look at 
malloc+utrace(2) and decided to make a tool to debug where leaks are 
coming from.


A few hours later I have:
1) a new version of utrace(2) (utrace2(2)) that uses structured data to 
prevent overloading of data. (utrace2.diff)
2) changes to ktrace and kdump to decode the new format. (also in 
utrace2.diff)
3) changes to jemalloc to include the new format AND the function caller 
so it's easy to get the source of the leaks. (also in utrace2.diff)
4) a program that can take a pipe of kdump(1) output and figure out what 
memory has leaked. (alloctrace.py)
5) a simple test program. (test_utrace.c)

If you want to get a trace now you can do this:
gcc -Wall -O ./test_utrace.c

env MALLOC_CONF='utrace:true' ktrace ./a.out
kdump | ./alloctrace.py
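
The bookkeeping that a tool like alloctrace.py has to do on the kdump(1) 
stream can be sketched in C roughly as follows.  This is only an 
illustration, not the posted script: the (p, s, r, caller) fields mirror 
malloc_utrace_t from the patch, allocations are assumed to be traced as 
UTRACE(0, size, ret) as in the diff, frees are assumed to arrive with a 
NULL result and a zero size, and leak_table, leak_record and 
leak_outstanding are hypothetical names.  The fixed-size table is an 
assumption made to keep the sketch short.

```c
#include <stddef.h>

#define LEAK_TABLE_MAX 1024     /* arbitrary capacity for the sketch */

struct live_alloc {
    void   *ptr;      /* address handed out by the allocator */
    size_t  size;     /* requested size */
    void   *caller;   /* return address recorded by the tracer */
};

struct leak_table {
    struct live_alloc slots[LEAK_TABLE_MAX];
    size_t            used;
};

/* Stop tracking ptr; returns 1 if it was live, 0 if it was unknown. */
static int
leak_forget(struct leak_table *t, const void *ptr)
{
    for (size_t i = 0; i < t->used; i++) {
        if (t->slots[i].ptr == ptr) {
            t->slots[i] = t->slots[t->used - 1];  /* fill the hole */
            t->used--;
            return 1;
        }
    }
    return 0;
}

/*
 * Apply one trace event:
 *   allocation:  p == NULL, r == result
 *   free:        p != NULL, s == 0, r == NULL   (assumed convention)
 *   realloc:     p != NULL, r == result
 * Returns 0 on success, -1 if the table is full.
 */
static int
leak_record(struct leak_table *t, void *p, size_t s, void *r, void *caller)
{
    /* A failed realloc (r == NULL, s != 0) leaves p allocated. */
    if (p != NULL && (r != NULL || s == 0))
        (void)leak_forget(t, p);
    if (r == NULL)
        return 0;
    if (t->used == LEAK_TABLE_MAX)
        return -1;
    t->slots[t->used].ptr = r;
    t->slots[t->used].size = s;
    t->slots[t->used].caller = caller;
    t->used++;
    return 0;
}

/* Total bytes still allocated according to the events seen so far. */
static size_t
leak_outstanding(const struct leak_table *t)
{
    size_t total = 0;

    for (size_t i = 0; i < t->used; i++)
        total += t->slots[i].size;
    return total;
}
```

Whatever is left in the table when the trace ends is a leak candidate, and 
the stored caller address is what makes it possible to point at the 
source of the leak.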


Now the problem I am having is making this work on a running program:
1) Turning on opt_utrace in a running program is almost impossible, 
because libc is installed stripped.  Unfortunately my gdb-fu is weak 
and I was unable to load the symbol file without a really bad hack.


The only way I could get it done was to use a trick from Ed Maste, which 
was to:

  1.1) install a debug copy of libc.so over the installed one. - dislike!
  1.2) launch gdb ./a.out pid,
  1.3) set __jemalloc_opt_utrace = 1,
  1.4) enable ktrace on the running binary: ktrace -p pid -t U  # this is utrace2 enabled
  1.5) run 'cont' in gdb to proceed.

There has to be an easier way to access the symbol __jemalloc_opt_utrace 
besides copying over the installed libc.


Is there a workaround for 1.1?

Is it time to start installing with some form of debug symbols? This 
would help us also with dtrace.


Ideas?

-Alfred

Index: contrib/jemalloc/src/jemalloc.c
===
--- contrib/jemalloc/src/jemalloc.c (revision 244503)
+++ contrib/jemalloc/src/jemalloc.c (working copy)
@@ -78,20 +78,26 @@
 static malloc_mutex_t  init_lock = MALLOC_MUTEX_INITIALIZER;
 #endif
 
+#ifdef JEMALLOC_UTRACE
 typedef struct {
+        int     ut_type;        /* utrace type UTRACE_MALLOC */
+        int     ut_version;     /* utrace malloc version */
         void    *p;     /* Input pointer (as in realloc(p, s)). */
         size_t  s;      /* Request size. */
         void    *r;     /* Result pointer. */
+        void    *ut_caller;     /* Caller */
 } malloc_utrace_t;
 
-#ifdef JEMALLOC_UTRACE
 #  define UTRACE(a, b, c) do {                                  \
         if (opt_utrace) {                                       \
                 malloc_utrace_t ut;                             \
+                ut.ut_type = UTRACE_MALLOC;                     \
+                ut.ut_version = 2;                              \
                 ut.p = (a);                                     \
                 ut.s = (b);                                     \
                 ut.r = (c);                                     \
-                utrace(&ut, sizeof(ut));                        \
+                ut.ut_caller = __builtin_return_address(0);     \
+                utrace2(&ut, sizeof(ut));                       \
         }                                                       \
 } while (0)
 #else
@@ -975,7 +981,6 @@
         }
         if (config_prof && opt_prof && result != NULL)
                 prof_malloc(result, usize, cnt);
-        UTRACE(0, size, result);
         return (ret);
 }
 
@@ -985,6 +990,7 @@
         int ret = imemalign(memptr, alignment, size, sizeof(void *));
         JEMALLOC_VALGRIND_MALLOC(ret == 0, *memptr, isalloc(*memptr,
             config_prof), false);
+        UTRACE(0, size, *memptr);
         return (ret);
 }
 
@@ -1000,6 +1006,7 @@
         }
         JEMALLOC_VALGRIND_MALLOC(err == 0, ret, isalloc(ret, config_prof),
             false);
+        UTRACE(0, size, ret);
         return (ret);
 }
 
@@ -1265,6 +1272,7 @@
         void *ret JEMALLOC_CC_SILENCE_INIT(NULL);
         imemalign(&ret, alignment, size, 1);
         JEMALLOC_VALGRIND_MALLOC(ret != NULL, ret, size, false);
+        UTRACE(0, size, ret);
         return (ret);
 }
 #endif
@@ -1276,6 +1284,7 @@
         void *ret JEMALLOC_CC_SILENCE_INIT(NULL);
         imemalign(&ret, PAGE, size, 1);
         JEMALLOC_VALGRIND_MALLOC(ret != NULL, ret, size, false);
+        UTRACE(0, size, ret);
         return (ret);
 }
 #endif
Index: sys/compat/freebsd32/syscalls.master
===
--- sys/compat/freebsd32/syscalls.master(revision 244503)
+++ sys/compat/freebsd32/syscalls.master(working copy)
@@ -1004,4 +1004,5 @@
int *status, int options, \
struct wrusage32 *wrusage, \
siginfo_t *info); }
+533 AUE_NULLSTD { int 

why is kern.maxproc not read/write?

2012-12-11 Thread Alfred Perlstein

Eitan was asking me to update the FAQ section 5.7:


*5.7.* Why do I get the error kernel: proc: table is full?

That error is no longer relevant, but I also found something else 
interesting.


I've been grepping through the code, and it seems the only side effect 
of changing maxproc would be overcrowding the hash tables tidhashtbl and 
pidhashtbl.


I can't see anything that's statically allocated any longer.

The only bad thing is that the procs seem to be taken from a 
UMA_ZONE_NOFREE zone, so if the user sets an insanely high value, it 
could be end of days.


Even the MD code seems to use it to size the number of pv entries.

I'm wondering if making this a runtime tunable that has a SYSCTL_PROC 
attached that doesn't allow it to go below some PROC_MIN would be OK.
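
The lower-bound check suggested here could look roughly like the 
following.  This is a userland model of the validation logic only, not 
kernel code: the value of PROC_MIN, the maxproc_set name and the plain 
int standing in for the kernel variable are all assumptions made for 
illustration, and a real handler would be hooked up through SYSCTL_PROC.

```c
#include <errno.h>

#define PROC_MIN 20           /* hypothetical floor; no value is named in this mail */

static int maxproc = 6164;    /* stand-in for the kernel variable; arbitrary value */

/*
 * Accept a new limit only if it is at or above PROC_MIN.
 * Returns 0 on success or EINVAL if the request is too low.
 */
static int
maxproc_set(int requested)
{
    if (requested < PROC_MIN)
        return EINVAL;
    maxproc = requested;
    return 0;
}
```

Given the UMA_ZONE_NOFREE concern above, a matching upper bound would 
probably be wanted as well.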


Am I missing something?

As far as Eitan's question about the FAQ section, the new message 
printed is:

maxproc limit exceeded by uid %i, please see tuning(7) and login.conf(5)

The FAQ is wrong: it tells the user to change sysctl.conf, where it 
should say to update loader.conf.


-Alfred
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Possible obscure socket leak when system under load and listener is slow to accept

2012-12-09 Thread Alfred Perlstein

On 12/8/12 5:05 PM, Richard Sharpe wrote:

On Sun, 2012-12-09 at 00:50 +0100, Andre Oppermann wrote:

Hi folks,

Our QA group (at xxx) using Samba and smbtorture has been seeing a
lot of cases where accept returns ECONNABORTED because the system load
is high and Samba has a large listen backlog.

Every now and then we get a crash in smbd or in winbindd and winbindd
complains of too many open files in the system.

In looking at kern_accept, it seems to me that FreeBSD can leak a socket
when kern_accept calls soaccept on it but gets ECONNABORTED. This error
is the only error returned from tcp_usr_accept.

It seems like the socket taken off so_comp is never freed in this case
and that there has been a call on soref on it as well, so that something
like the following is needed in the error path:

 //some-path/freebsd/sys/kern/uipc_syscalls.c#1
- /home/rsharpe/dev-src/packages/freebsd/sys/kern/uipc_syscalls.c 
@@ -433,6 +433,14 @@
  */
 if (name)
 *namelen = 0;
+   /*
+* We need to close the socket we unlinked
+* so we do not leak it.
+*/
+   ACCEPT_LOCK();
+   SOCK_LOCK(so);
+   soclose(so);
 goto noconnection;
 }
 if (sa == NULL) {

I think an soclose is needed at this point because soisconnected has
been called on the socket.

Do you think this analysis is reasonable?

  

We are using FreeBSD 8.0 but it seems the same is true for 9.0. However,
maybe I am wrong since I am not sure if the fdclose call would free the
socket, but a quick look suggested that it doesn't.

The fdclose should properly tear down the file descriptor.  The call
graph is: fdclose() -> fdrop() -> _fdrop() -> fo_close()/soo_close() ->
soclose() -> sorele() -> sofree() -> sodealloc().

A socket leak would not count against kern.maxfiles unless the file
descriptor leaks as well.  So it is unlikely that this is the problem.

OK, thanks for the feedback. I will keep looking.


Samba may open a large number of files (real files and sockets) and
you may run into the maxfiles limit.  You can check the limit with
sysctl kern.maxfiles and increase it at boot time in boot/loader.conf
with kern.maxfiles=10 for example.

Well, some of the smbds are dying, but it is possible that there is a
file leak in Samba or our VFS that we are tripping as well.


lsof and sockstat can be helpful.  lsof may be able to help determine if 
there's a leak because it MAY find sockets not associated with a 
process.
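
On the application side, the ECONNABORTED from accept(2) described in 
this thread is normally treated as transient: the connection went away 
while it sat in the backlog, so the server just calls accept(2) again.  
A minimal sketch of that pattern follows; accept_retry and 
accept_error_is_transient are hypothetical helper names, and this says 
nothing about what Samba itself does.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Errors after which a server can simply call accept(2) again. */
static int
accept_error_is_transient(int err)
{
    return err == ECONNABORTED || err == EINTR;
}

/*
 * Accept one connection, retrying on transient errors.
 * Returns the new descriptor, or -1 with errno set on a real failure.
 */
static int
accept_retry(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);

        if (fd >= 0 || !accept_error_is_transient(errno))
            return fd;
    }
}
```

Errors such as EMFILE or ENFILE are deliberately not retried here, since 
they indicate descriptor exhaustion rather than a vanished peer.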


Hope this helps.

-Alfred



Re: clang mangling some static struct names?

2012-11-18 Thread Alfred Perlstein

On Nov 18, 2012, at 5:37 AM, Dimitry Andric d...@freebsd.org wrote:

 On 2012-11-16 23:04, Navdeep Parhar wrote:
 On 11/16/12 13:49, Roman Divacky wrote:
 Yes, it does that. iirc so that you can have things like
 
 void foo(int cond) {
   if (cond) {
 static int i = 7;
   } else {
 static int i = 8;
   }
 }
 
 working correctly.
 
 It's not appending the .n everywhere.  And when it does, I don't see any
 potential collision that it prevented by doing so.  Instead, it looks
 like the .n symbol corresponds to the nth element in the structure (so
 this is not name mangling in the true sense).  I just don't see the
 point in doing things this way.  It is only making things harder for
 debuggers.
 
 I don't think the point is making things harder for debuggers, the point
 is optimization.  Since static variables and functions can be optimized
 away, or arbitrarily moved around, you cannot count on those symbols
 being there at all.

Bro, do you even debug?



Re: [RFQ] make witness panic an option

2012-11-16 Thread Alfred Perlstein

On 11/15/12 11:22 PM, Andriy Gapon wrote:

on 16/11/2012 01:20 Alfred Perlstein said the following:

We need to enable developers to skip these areas and test their own code.

I wish that there was a magic knob to ignore build breakages, so that the
developers could test how their own code compiles :-)
There is: it's called updating to a known-good tinderbox build and basing 
changes off of that.


On a serious note, why stop here?  E.g. Solaris seems to have a knob to ignore all
asserts (just to print a message, but not panic).

There is no reason not to add such a thing; in fact it would be 
really handy for some of our users who need asserts, but sometimes can't 
clean up the entire code base.


Adding another option to tag asserts so that it was sort of like:

KASSERT((cond, section, string)); would be interesting, then you could 
turn KASSERTS on based on vfs or possibly file by file.
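
A section-tagged assert of that kind could be modelled in userland like 
this.  It is only a sketch of the idea: SASSERT, the section enum and the 
per-section switch are hypothetical, abort(3) stands in for panic(9), and 
none of this is an existing kernel interface.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sections; only "vfs" is mentioned as an example above. */
enum assert_section { SEC_VFS, SEC_NET, SEC_MAX };

/* Per-section switch: nonzero means a failed assertion is fatal. */
static int assert_fatal[SEC_MAX];

/* Failed assertions seen in sections where failures are not fatal. */
static unsigned assert_warnings[SEC_MAX];

static void
sassert_fail(enum assert_section sec, const char *msg, const char *file,
    int line)
{
    /* The message is always printed, whether or not we go on to abort. */
    fprintf(stderr, "assertion failed: %s (%s:%d)\n", msg, file, line);
    if (assert_fatal[sec])
        abort();                /* stands in for panic(9) */
    assert_warnings[sec]++;
}

#define SASSERT(cond, sec, msg) do {                            \
    if (!(cond))                                                \
        sassert_fail((sec), (msg), __FILE__, __LINE__);         \
} while (0)
```

With every entry of assert_fatal left at zero this behaves like the 
"print a message, but do not panic" knob; setting assert_fatal[SEC_VFS] 
to 1 would make only the vfs assertions fatal.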


-Alfred


Re: [RFQ] make witness panic an option

2012-11-16 Thread Alfred Perlstein

On 11/16/12 10:18 AM, Adrian Chadd wrote:

On 16 November 2012 00:26, Alfred Perlstein bri...@mu.org wrote:


Adding another option to tag asserts so that it was sort of like:

KASSERT((cond, section, string)); would be interesting, then you could
turn KASSERTS on based on vfs or possibly file by file.

That's orthogonal to my developer-focused request. I'm also a big fan
of correctly using asserts/panics - ie, asserts shouldn't replace
correct error handling.
(Yes, I'm guilty of this in ath(4), but I have plans soon to rectify this.)



Adrian
I apologize if you took a wishlist item of mine as a request for you to 
take it on or augment your patch.


It was not my intention.

Back to your work: I like your patch quite a bit.  I am wondering, though, 
if it can be worked into something under witness_kdb.


-Alfred


Re: [RFQ] make witness panic an option

2012-11-15 Thread Alfred Perlstein

On 11/14/12 10:15 PM, Adrian Chadd wrote:

Hi all,

When debugging and writing wireless drivers/stack code, I like to
sprinkle lots of locking assertions everywhere. However, this does
cause things to panic quite often during active development.

This patch (against stable/9) makes the actual panic itself
configurable. It still prints the message regardless.

This has allowed me to sprinkle more locking assertions everywhere to
investigate whether particular paths have been hit or not. I don't
necessarily want those to panic the kernel.

I'd like everyone to consider this for FreeBSD-HEAD.

Thanks!


Adrian, you seem to be meeting some reluctance on your patch, which is 
surprising since we have the witness_kdb option, which pretty much does 
exactly what you want...


...except where you need it to.  That is unfortunate.

Perhaps if you switched those panics to a WITNESS_WARN that would do 
what you need/want?


You could pass a special flag into WITNESS_WARN that says "I'm going to 
pass you a NULL ptr for the lock object... just behave as if there was an 
error."


That should make things more concise.

Will that be sufficient?

-Alfred





Adrian




Re: [RFQ] make witness panic an option

2012-11-15 Thread Alfred Perlstein

On 11/15/12 12:51 PM, Andriy Gapon wrote:

on 15/11/2012 22:00 Adrian Chadd said the following:

But I think my change is invaluable for development, where you want to
improve and debug the locking and lock interactions of a subsystem.

My practical experience was that if you mess up one lock in one place, then it
is a total mess further on.  but apparently you've got a different practical
experience :-)

What would indeed be invaluable to _me_ - if the LOR messages also produced the
stack(s) where a supposedly correct lock order was learned.


Adrian is right.

In a large scale environment breakages will be introduced in places you 
do not have access to.


We need to enable developers to skip these areas and test their own code.

Without Adrian's concept, someone who may have no idea about a subsystem 
is forced either to be blocked or to put their work aside to work on a 
problem that is someone else's responsibility.


I locked down SMP at a large company in a FreeBSD code base and had this 
same problem.  Adrian's patch would have helped all of us tremendously.


Adrian, can you look at my suggestion to merge with witness_kdb and see 
if that will suffice?


-Alfred


Re: Memory reserves or lack thereof

2012-11-12 Thread Alfred Perlstein


On Nov 12, 2012, at 4:11 AM, Andre Oppermann an...@freebsd.org wrote:
 
 
 I don't think many places depend on M_NOWAIT digging deep.  I'm
 perfectly happy with having M_NOWAIT give up on first try.  Only
 together with M_TRY_REALLY_HARD it would dig into reserves.
 
 PS: We have a really nasty namespace collision with the mbuf flags
 which use the M_* prefix as well.

Agreed. 

 




Re: Memory reserves or lack thereof

2012-11-11 Thread Alfred Perlstein
I think very few of the M_NOWAIT callers actually need the reserve behavior. We 
should probably switch away from it digging that deep by default and introduce a 
flag and/or a per-thread flag to set the behavior. 

Sent from my iPhone

On Nov 11, 2012, at 4:32 PM, Dieter BSD dieter...@engineer.com wrote:

 Alan writes:
 In conclusion, I think it's time that we change M_NOWAIT so that it doesn't
 dig any deeper into the cache/free page queues than M_WAITOK does and
 reintroduce a M_USE_RESERVE-like flag that says dig deep into the
 cache/free page queues.  The trouble is that we then need to identify all
 of those places that are implicitly depending on the current behavior of
 M_NOWAIT also digging deep into the cache/free page queues so that we can
 add an explicit M_USE_RESERVE.
 
 find /usr/src/sys | xargs grep M_NOWAIT | wc -l
 2101
 
 Sounds like a lot of work that would need to happen atomically.
 Would this work:
 
 M_NO_WAIT       do not sleep, do not dig deep unless M_USE_RESERVE also set
 M_USE_RESERVE   dig deep
 M_NOWAIT        M_NO_WAIT | M_USE_RESERVE (deprecated)
 
 New code avoids using M_NOWAIT. Existing code continues working the same way.
 As time permits, old code is converted to new flags. Eventually M_NOWAIT
 goes away.
 
 Pro: the amount of code that needs to change atomically is much smaller.
 
 Con: (1) Have to remember (or look up) difference between M_NOWAIT
 and M_NO_WAIT. Maybe calling the new flag M_NO_SLEEP would help?
 (2) Would M_NOWAIT really ever go away? The spl() calls haven't,
 even after some cage rattling.
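
The flag scheme proposed in the quoted message can be written down as a 
small sketch.  The bit values below are made up for illustration and are 
not the ones in sys/malloc.h; may_sleep and may_use_reserve are 
hypothetical helpers that only show how an allocator would interpret the 
bits.

```c
/* Illustrative bit values only; not the values used in sys/malloc.h. */
#define M_NO_WAIT      0x0001   /* do not sleep, do not dig into reserves */
#define M_USE_RESERVE  0x0002   /* may dig deep into the cache/free page queues */
#define M_NOWAIT       (M_NO_WAIT | M_USE_RESERVE)  /* deprecated alias */

/* Nonzero if an allocation made with these flags may touch the reserves. */
static int
may_use_reserve(int flags)
{
    return (flags & M_USE_RESERVE) != 0;
}

/* Nonzero if an allocation made with these flags is allowed to sleep. */
static int
may_sleep(int flags)
{
    return (flags & M_NO_WAIT) == 0;
}
```

Existing M_NOWAIT callers keep today's behaviour because the alias sets 
both bits, while new code can pass M_NO_WAIT alone and add M_USE_RESERVE 
only where it is really needed.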


please review: patch to retain device name for dumpdev.

2012-11-01 Thread Alfred Perlstein

I've always wanted to be able to query the current dump device.

This patch lets me do this.

Poul-Henning, what do you think?  Is there a nicer way?  Perhaps a way 
to include the /dev/$device path?  The patch in its current form only 
stores ada0p3, though I think that is OK.


Provide a device name in the sysctl tree for programs to query the state of 
crashdump target devices.

This will be used to add a -l (ell) flag to dumpon(8) to list the currently 
configured dumpdev.



/usr/9-stable % svn diff sys/dev/null sys/geom sys/kern sys/sys
Index: sys/dev/null/null.c
===
--- sys/dev/null/null.c (revision 242367)
+++ sys/dev/null/null.c (working copy)
@@ -91,7 +91,7 @@
         case DIOCSKERNELDUMP:
                 error = priv_check(td, PRIV_SETDUMPER);
                 if (error == 0)
-                        error = set_dumper(NULL);
+                        error = set_dumper(NULL, NULL);
                 break;
         case FIONBIO:
                 break;
Index: sys/geom/geom_dev.c
===
--- sys/geom/geom_dev.c (revision 242367)
+++ sys/geom/geom_dev.c (working copy)
@@ -351,7 +351,7 @@
         case DIOCSKERNELDUMP:
                 u = *((u_int *)data);
                 if (!u) {
-                        set_dumper(NULL);
+                        set_dumper(NULL, NULL);
                         error = 0;
                         break;
                 }
@@ -360,7 +360,7 @@
                 i = sizeof kd;
                 error = g_io_getattr("GEOM::kerneldump", cp, &i, &kd);
                 if (!error) {
-                        error = set_dumper(&kd.di);
+                        error = set_dumper(&kd.di, devtoname(dev));
                         if (!error)
                                 dev->si_flags |= SI_DUMPDEV;
                 }
@@ -518,7 +518,7 @@
 
         /* Reset any dump-area set on this device */
         if (dev->si_flags & SI_DUMPDEV)
-                set_dumper(NULL);
+                set_dumper(NULL, NULL);
 
         /* Destroy the struct cdev *so we get no more requests */
         destroy_dev(dev);
Index: sys/kern/kern_shutdown.c
===
--- sys/kern/kern_shutdown.c(revision 242367)
+++ sys/kern/kern_shutdown.c(working copy)
@@ -711,18 +711,28 @@
         printf("done\n");
 }
 
+static char dumpdevname[sizeof(((struct cdev*)NULL)->si_name)];
+SYSCTL_STRING(_kern_shutdown, OID_AUTO, dumpdevname, CTLFLAG_RD,
+    dumpdevname, 0, "Device for kernel dumps");
+
 /* Registration of dumpers */
 int
-set_dumper(struct dumperinfo *di)
+set_dumper(struct dumperinfo *di, const char *devname)
 {
 
         if (di == NULL) {
                 bzero(&dumper, sizeof dumper);
+                dumpdevname[0] = '\0';
                 return (0);
         }
         if (dumper.dumper != NULL)
                 return (EBUSY);
         dumper = *di;
+        strlcpy(dumpdevname, devname, sizeof(dumpdevname));
+        if (strlen(dumpdevname) != strlen(devname)) {
+                printf("set_dumper: device name truncated from '%s' -> '%s'\n",
+                    devname, dumpdevname);
+        }
         return (0);
 }
 
Index: sys/sys/conf.h
===
--- sys/sys/conf.h  (revision 242367)
+++ sys/sys/conf.h  (working copy)
@@ -335,7 +335,7 @@
         off_t   mediasize;      /* Space available in bytes. */
 };
 
-int set_dumper(struct dumperinfo *);
+int set_dumper(struct dumperinfo *, const char *_devname);
 int dump_write(struct dumperinfo *, void *, vm_offset_t, off_t, size_t);
 void dumpsys(struct dumperinfo *);
 int doadump(boolean_t);
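
The patch detects truncation by comparing strlen() of the source and the 
copy.  The same check can be made from the return value of the copy 
itself, since strlcpy(3) returns the length of the string it tried to 
create.  The sketch below shows that idea in portable C; it uses 
snprintf(3) instead of strlcpy(3) only because strlcpy is not available 
everywhere, and copy_devname is a hypothetical helper, not part of the 
patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src into dst, which has room for dstsize bytes (dstsize > 0).
 * Returns 1 if the name had to be truncated, 0 if it fit.
 */
static int
copy_devname(char *dst, size_t dstsize, const char *src)
{
    /* snprintf returns the length the full string would have needed. */
    int n = snprintf(dst, dstsize, "%s", src);

    return n < 0 || (size_t)n >= dstsize;
}
```

In the kernel the equivalent test would be whether strlcpy() returned a 
value greater than or equal to sizeof(dumpdevname).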




Re: please review: patch to retain device name for dumpdev.

2012-11-01 Thread Alfred Perlstein

On 11/1/12 1:06 AM, Poul-Henning Kamp wrote:


In message 50921b44.20...@ixsystems.com, Alfred Perlstein writes:


Poul-Henning, what do you think?  Is there a nicer way?  Perhaps a way
to include the /dev/$device

I think there are private implementations where dumpdev is a network thing,
so too much magic string editing is probably not a good idea.

Given that /dev is really just a view into GEOM's namespace, one could
argue for GEOM:ada0p3, but that may be going overboard in semantic
correctness.


Good point, thank you.  I'll leave the patch as-is.

-Alfred


make -jN buildworld on 512MB ram

2012-10-31 Thread Alfred Perlstein
It seems like the new compiler likes to get up to ~200+MB resident when 
building some basic things in our tree.


Unfortunately this causes smaller machines (VMs) to take days because of 
swap thrashing.


Doesn't our make(1) have some stuff to mitigate this?  I would expect it 
to be a bit smarter about detecting the number of swaps/pages/faults of 
its children and taking into account the machine's total ram before 
forking off new processes.  I know gmake has some algorithms, although 
last I checked they were very naive and didn't work well.


Any ideas?  I mean a really simple algorithm could be devised that would 
be better than what we appear to have (which is nothing).


Even if no such algorithm can be devised, why not something that just 
throttles the maximum number of c++/g++ processes spawned?  Maybe I'm 
missing a trick I can pull off with some make.conf knobs?


Idk, summer of code idea?  Anyone mentoring someone they want to have a 
look at this?


-Alfred


Re: make -jN buildworld on 512MB ram

2012-10-31 Thread Alfred Perlstein



On 10/31/12 1:41 PM, Peter Jeremy wrote:

On 2012-Oct-31 12:58:18 -0700, Alfred Perlstein bri...@mu.org wrote:

It seems like the new compiler likes to get up to ~200+MB resident when
building some basic things in our tree.

The killer I found was the ctfmerge(1) on the kernel - which exceeds
~400MB on i386.  Under low RAM, that fails _without_ reporting any
errors back to make(1), resulting in a corrupt new kernel (it booted
but had virtually no devices so it couldn't find root).

Trolled by FreeBSD. :)



Doesn't our make(1) have some stuff to mitigate this?  I would expect it
to be a bit smarter about detecting the number of swaps/pages/faults of
its children and taking into account the machine's total ram before
forking off new processes.

The difficulty I see is that the make process can't tell anything
about the memory requirements of the pipeline it is about to spawn.
As a rule of thumb, C++ needs more memory than C but that depends
on what is being compiled - I have a machine-generated C program that
makes gcc bloat to ~12GB.
Ah, but make(1) can delay spawning any new processes when it knows its 
children are paging.


This is sort of like "well, you can't predict when an elevator will 
plunge to its doom"...


...but you can stop loading hapless people onto it when it starts 
creaking (paging/swapping).







Any ideas?  I mean a really simple algorithm could be devised that would
be better than what we appear to have (which is nothing).

If you can afford to waste CPU, one approach would be for make(1) to
setrlimit(2) child processes and if the child dies, it retries that
child by itself - but that will generate unnecessary retries.

This doesn't really help.



Another, more involved, approach would be for the scheduler to manage
groups of processes - if a group of processes is causing memory
pressure as a whole then the scheduler just stops scheduling some of
them until the pressure reduces (effectively swap them out).  (Yes,
that's vague and lots of hand-waving that might not be realisable).


I think that could be done, this is actually a very interesting idea.

Another idea is for make(1) to start to kill -STOP a child when it 
detects a lot of child paging until other independent children complete 
running, which is basically what I do manually when my build explodes 
until it gets past some C++ bits.


*ugh*

-Alfred


Re: make -jN buildworld on 512MB ram

2012-10-31 Thread Alfred Perlstein

On 10/31/12 3:14 PM, Peter Jeremy wrote:

On 2012-Oct-31 14:21:51 -0700, Alfred Perlstein bri...@mu.org wrote:

Ah, but make(1) can delay spawning any new processes when it knows its
children are paging.

That could work in some cases and may be worth implementing.  Where it
won't work is when make(1) initially hits a parallelisable block of
big programs after a series of short, small tasks: System is OK so
the first big program is spawned.  ~100msec later, the next small task
finishes.  System is still OK (because the first big task is still
growing and hasn't achieved peak bloat[*]) so it spawns another big task.
Repeat a few times and you have a collection of big processes starting
to thrash the system.


True, but the idea is to somewhat mitigate it, not solve it completely.

So sure, you might thrash for a while, but I've seen buildworld thrash 
for HOURS not making any progress, so even if it thrashes for a bit that 
is a big win over endless thrashing.



Another, more involved, approach would be for the scheduler to manage
groups of processes - if a group of processes is causing memory
pressure as a whole then the scheduler just stops scheduling some of
them until the pressure reduces (effectively swap them out).  (Yes,
that's vague and lots of hand-waving that might not be realisable).


I think that could be done, this is actually a very interesting idea.

Another idea is for make(1) to start to kill -STOP a child when it
detects a lot of child paging until other independent children complete
running, which is basically what I do manually when my build explodes
until it gets past some C++ bits.

This is roughly a userland variant of the scheduler change above.  The
downside is that make(1) can no longer just wait(2) for a process to
exit and then decide what to do next.  Instead, it needs to poll the
system's paging activity and take action on one of its children.  Some
of the special cases it needs to handle are:
1) The offending process isn't a direct child but a more distant
descendant - this will be the typical case: make(1) starts gcc(1)
which spawns cc1plus which bloats.
2) Multiple (potentially independent) make(1) processes all detect that
the system is too busy and stop their children.  Soon after, the
system is free so they all SIGCONT their children.  Repeat.  (Note
that any scheduler changes also need to cope with this).

[*] Typical cc1/cc1plus behaviour is to steadily grow as the input is
 processed.  At higher optimisation levels, parse trees are not
 freed at the end of a function to allow global inlining and
 optimisation.


Sure, these are obstacles, but I do not think they are insurmountable.

1 can be addressed by walking the process tree.
2 can be addressed by simply setting an environment flag that denotes 
the MASTER process, so that subchildren do not try to schedule as well.
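
A minimal sketch of the environment-flag idea, together with the
stop/continue calls a throttling make(1) would issue.  The variable name
MAKE_PAGING_MASTER and the function names are made up for illustration;
make(1) has no such feature today.

```c
#define _POSIX_C_SOURCE 200809L	/* setenv(3) and kill(2) declarations */

#include <sys/types.h>
#include <signal.h>
#include <stdlib.h>

/* Hypothetical variable name, not an existing make(1) knob. */
#define MASTER_ENV	"MAKE_PAGING_MASTER"

/*
 * Returns 1 if the caller becomes the master throttler, 0 if the flag is
 * already present (normally inherited from an ancestor make).
 */
int
claim_master(void)
{
	if (getenv(MASTER_ENV) != NULL)
		return (0);
	/* Children inherit the environment, so they will stand down. */
	if (setenv(MASTER_ENV, "1", 1) != 0)
		return (0);
	return (1);
}

/* Suspend a child while the system is paging heavily. */
int
pause_child(pid_t pid)
{
	return (kill(pid, SIGSTOP));
}

/* Let a suspended child run again once the pressure drops. */
int
resume_child(pid_t pid)
{
	return (kill(pid, SIGCONT));
}
```

Only the process for which claim_master() returns 1 would poll paging
activity and call pause_child()/resume_child(); every sub-make sees the
inherited flag and leaves scheduling alone.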


-Alfred






Re: Threaded 6.4 code compiled under 9.0 uses a lot more memory?..

2012-10-30 Thread Alfred Perlstein

Some of the suggestions here (jemalloc, kernel threads) are good ones.

Another issue may just be a change in the default thread stack size.
This would explain why the RESIDENT set is the same, but the VIRTUAL grew.
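
One way to check this guess is to print the default stack size on both
systems and, if it differs, to request an explicit size when creating
threads.  The sketch below uses only the POSIX pthread attribute calls;
the helper names are invented for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Report the stack size a default-attribute thread would get; 0 on error. */
size_t
default_thread_stack_size(void)
{
	pthread_attr_t attr;
	size_t size = 0;

	if (pthread_attr_init(&attr) != 0)
		return (0);
	if (pthread_attr_getstacksize(&attr, &size) != 0)
		size = 0;
	pthread_attr_destroy(&attr);
	return (size);
}

/*
 * Initialise attr so that threads created with it get a stack of the
 * requested size instead of the platform default.
 * Returns 0 on success or an errno value.
 */
int
init_attr_with_stack(pthread_attr_t *attr, size_t bytes)
{
	int error;

	if ((error = pthread_attr_init(attr)) != 0)
		return (error);
	if ((error = pthread_attr_setstacksize(attr, bytes)) != 0)
		pthread_attr_destroy(attr);
	return (error);
}
```

Passing the initialised attr to pthread_create() for every worker would
make the per-thread virtual size the same on 6.4 and 9.0, assuming the
stack default really is what changed.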


-Alfred

On 10/30/12 9:56 AM, Karl Pielorz wrote:



--On 30 October 2012 19:43 +0700 Erich Dollansky 
erichfreebsdl...@ovitrap.com wrote:



Depends how you mean 'the same' - on the 6.4 system it shows:

   cc (GCC) 3.4.6 [FreeBSD] 20060305

And, on the 9.0-S it shows:

   cc (GCC) 4.2.1 20070831 patched [FreeBSD]

So 'same' - but different versions.


did you check the default data sizes?


How do you mean?


Now they've been running for an hour or so - they've gotten a little
larger 552M/154M and 703M/75M.

If it's not harmful I can live with it - it was just a bit of a
surprise.


And a reason to spend more money on memory. Knowing the real reason
would be better.

I can understand your surprise.


Hehe, more 'concern' than surprise I guess now...

The sendmail milter has grown to a SIZE/RES of 1045M / 454M under 9.0. 
The original 6.4 machine under heavier load (more connections) shows a 
SIZE/RES of 85M/52M.


The TCP listener code is now showing a SIZE/RES of 815M/80M under 9.0 
with the original 6.4 box showing 44M/9.5M


The 9.0 box says it has 185M active, 472M inactive, 693M wired, 543M 
buf, and 4554M free.


At this stage I'm just a bit concerned that at least the milter code 
is going to grow, and grow - and die.


I would think it would last over night so I'll see what the figures 
are in the morning.


Thanks for the replies...

-Karl




capping memory usage of buildworld (c++)

2012-03-25 Thread Alfred Perlstein
I have a few vms with only 768MB to 1GB of ram.  The problem is
that buildworld is slow unless I give make(1) a jobs arg of about
8.  However now when it reaches the c++ part of the build, it starts
to page like crazy:

Mem: 495M Active, 47M Inact, 162M Wired, 20M Cache, 85M Buf, 1624K Free
Swap: 2048M Total, 1527M Used, 521M Free, 74% Inuse, 4272K In, 280K Out

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
44453 root        1  21    0   225M 71704K swread  1   2:14  2.98% cc1plus
    4 root        1  21    0   225M 68056K swread  1   2:18  1.95% cc1plus
44451 root        1  21    0   225M 72564K swread  0   2:17  1.95% cc1plus
44459 root        1  21    0   221M 66824K swread  1   2:13  1.95% cc1plus
    2 root        1  21    0   225M 67908K swread  1   2:12  1.95% cc1plus
44577 alfred      1  26    0 16716K  1812K CPU1    1   0:00  1.95% top
    6 root        1  21    0   225M 75148K swread  1   2:14  0.98% cc1plus
    0 root        1  21    0   225M 65184K swread  1   2:13  0.98% cc1plus
44435 root        1  21    0   225M 65084K swread  1   2:10  0.98% cc1plus

Is there any way to cap the jobs when forking off c++?

Would people be opposed to a hack where optionally one could fence
off the jobs for C++ programs during buildworld?  meaning maybe
set .NOTPARALLEL via some other option?

It's kind of insane how much memory the compiler uses these days.

Any other options?

-- 
- Alfred Perlstein
.- VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Re: Examining the VM splay tree effectiveness

2010-09-30 Thread Alfred Perlstein
 is a much better, if not the optimal,
 fit for the vmmap and page table.  RB trees are balanced binary trees
 and O(log n) in all cases.  The big advantage in this context is that
 lookups are pure reads and do not cause CPU cache invalidations on
 other CPUs and always only require a read lock without the worst
 case properties of the unbalanced splay tree.  The high cache locality
 of the vmmap lookups can be used with the RB tree as well by simply
 adding a pointer to the least recently found node.  To prevent write 
 locking this can be done lazily.  More profiling is required to make
 a non-speculative statement on this though.  In addition a few of
 the additional linked lists that currently accompany the vmmap and
 page structures are no longer necessary as they easily can be done
 with standard RB tree accessors.  Our standard userspace jemalloc
 also uses RB trees for its internal housekeeping.  RB tree details:
  http://en.wikipedia.org/wiki/Red-black_tree
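
To make the "read-only lookup plus cached pointer" idea concrete, here is
a small sketch.  It is not taken from the patches mentioned in this
message: the struct and function names are hypothetical, and the tree is
a plain binary search tree with no rebalancing code, since a lookup in a
red-black tree walks the nodes in exactly the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a vm map entry. */
struct entry {
	uintptr_t	 start;		/* inclusive */
	uintptr_t	 end;		/* exclusive */
	struct entry	*left, *right;
};

struct map {
	struct entry	*root;
	struct entry	*hint;		/* last entry found, may be NULL */
};

static int
contains(const struct entry *e, uintptr_t addr)
{
	return (e != NULL && addr >= e->start && addr < e->end);
}

/*
 * Find the entry covering addr.  The tree walk only reads nodes; the
 * single write is the hint update, which a real implementation could
 * defer or skip so that a read lock is enough.
 */
struct entry *
map_lookup(struct map *m, uintptr_t addr)
{
	struct entry *e;

	if (contains(m->hint, addr))
		return (m->hint);
	for (e = m->root; e != NULL;) {
		if (addr < e->start)
			e = e->left;
		else if (addr >= e->end)
			e = e->right;
		else {
			m->hint = e;
			return (e);
		}
	}
	return (NULL);
}
```

Unlike a splay lookup, nothing here restructures the tree, so repeated
lookups from several CPUs would only contend on the hint field.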
 
 I say hypothesis because I haven't measured the difference to an
 RB tree implementation yet.  I've hacked up a crude and somewhat
 mechanical patch to convert the vmmap and page VM structures to
 use RB trees, the vmmap part is not stable yet.  The page part
 seems to work fine though.
 
 This is what I've hacked together so far:
  http://people.freebsd.org/~andre/vmmap_vmpage_stats-20100930.diff
  http://people.freebsd.org/~andre/vmmap_vmpage_rbtree-20100930.diff
 
 The diffs are still in their early stages and do not make use of
 any code simplifications becoming possible by using RB trees instead
 of the current splay trees.
 
 Comments on the VM issue and splay vs. RB tree hypothesis welcome.
 
 -- 
 Andre

-- 
- Alfred Perlstein
.- VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Re: What is the expected behavior with the NMI button?

2010-09-08 Thread Alfred Perlstein
* Sean Bruno sean...@yahoo-inc.com [100625 07:18] wrote:
 While trying to get a deadlock sorted out in the GPROF code, I attempted
 to use this fancy shmancy NMI button on my Dell server.
 
 I noted that, not unlike the goggles, it did nothing once the system was
 deadlocked.  I noted that when the system was running normally, an NMI
 log message would be spewed to the console.
 
 What is supposed to happen in these two cases when we toggle the NMI
 button?

If you have DDB compiled into the kernel and
machdep.panic_on_nmi: 1
machdep.kdb_on_nmi: 1
are set, you should drop into the debugger.

-Alfred



 
 Sean
 

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Re: Efficient way to determine when a child process forks or calls exec

2010-05-19 Thread Alfred Perlstein
* Dan McNulty dkmcnu...@gmail.com [100519 07:13] wrote:
 Thanks for all the great suggestions!
 
 It looks like the kevent system call is the closest to what I need.
 However, I didn't mention this, but I would like the process being
 traced to be stopped on entrance to fork, exec, etc. This would be
 similar to Linux's ptrace interface which sends a SIGTRAP to the
 traced process on exec, fork, etc. From what I could tell so far,
 kevent doesn't provide this functionality.
 
 Am I missing something? Is there a way to get kevent to stop the
 process when events occur?

Not that I know of off the top of my head.

Although if you want to contrib the code I can help get it in. :)

-Alfred


 
 Thanks again for your help,
 -Dan
 
 On Tue, May 18, 2010 at 2:40 AM, Alfred Perlstein alf...@freebsd.org wrote:
  * Dan McNulty dkmcnu...@gmail.com [100517 08:02] wrote:
  Hi all,
 
  I have been experimenting with ptrace to determine when a child
  process forks or calls exec. Particularly, I have explored tracing
  every system call entry and exit similar to what the truss utility
  does, and for my case, the performance impact of tracing every system
  call is too great.
 
  Is there a more efficient way than tracing every system call entry and
  exit to determine when a child process forks, calls exec, or creates a
  new LWP?
 
  Thanks a lot for your help!
 
  kevent has some hooks, have you looked at that?
 
  --
  - Alfred Perlstein
  .- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
  .- FreeBSD committer
 

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Re: Efficient way to determine when a child process forks or calls exec

2010-05-18 Thread Alfred Perlstein
* Dan McNulty dkmcnu...@gmail.com [100517 08:02] wrote:
 Hi all,
 
 I have been experimenting with ptrace to determine when a child
 process forks or calls exec. Particularly, I have explored tracing
 every system call entry and exit similar to what the truss utility
 does, and for my case, the performance impact of tracing every system
 call is too great.
 
 Is there a more efficient way than tracing every system call entry and
 exit to determine when a child process forks, calls exec, or creates a
 new LWP?
 
 Thanks a lot for your help!

kevent has some hooks, have you looked at that?

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Coverity warning: strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN);

2010-05-01 Thread Alfred Perlstein
I notice this code sprinkled through the sources:
  strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN);

This trips up Coverity because it does not know for sure
that the string returned by cam_sim_name() is going to
be at most DEV_IDLEN-1 characters long.  If it is longer,
strncpy() leaves dev_name without a terminating NUL.

Should we switch these calls to strlcpy?  Is there a smarter
thing to do to code more defensively?
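
For comparison, here is a small portable sketch of the semantics
strlcpy(3) provides.  The helper name bounded_copy is made up so the
example builds on systems without strlcpy; on FreeBSD one would simply
call strlcpy itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (dstsize bytes), always NUL-terminating when
 * dstsize > 0.  Returns strlen(src), so a return value >= dstsize
 * tells the caller the copy was truncated.
 */
size_t
bounded_copy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize > 0) {
		size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}
```

One difference to keep in mind when converting: strncpy() zero-fills the
rest of the destination, while strlcpy() writes nothing past the
terminating NUL.  If the structure is later copied out wholesale, the
buffer may need to be zeroed first.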

thank you,
-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Re: fixes for enhanced coredump

2010-04-28 Thread Alfred Perlstein
Additionally I need to remove all traces of IO_NODELOCKED
from kern_gzio.c as they lead to unlocked vnode access
otherwise in the gzip coredump routine.

A review would be much appreciated.

thank you,
-Alfred

* Alfred Perlstein alf...@freebsd.org [100428 10:18] wrote:
 I was recently working on the enhanced coredumps
 internal to Juniper and realized that there were
 some defects in the code I pushed (mostly due to
 mismerge), can someone please review?
 
 1) don't allocate hostname[] on the stack 
 2) don't leak the temp buffer in imgact_elf_coredump.
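
Both points come down to the usual single-exit cleanup pattern.  The
userland sketch below only illustrates that pattern; the function name,
the buffer size and the "prefix.host" format are invented and are not
the kernel code in the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HOSTBUF_LEN	256	/* stand-in for MAXHOSTNAMELEN */

/*
 * Build "<prefix>.<host>" into out (outlen bytes) using two heap buffers
 * instead of stack arrays.  Every exit path goes through "done", so
 * neither buffer can leak.  Returns 0 on success, -1 on failure.
 */
int
build_name(const char *prefix, const char *host, char *out, size_t outlen)
{
	char *temp = NULL, *hostname = NULL;
	int error = -1, n;

	temp = malloc(HOSTBUF_LEN);
	if (temp == NULL)
		goto done;
	hostname = malloc(HOSTBUF_LEN);
	if (hostname == NULL)
		goto done;

	/* Pass the allocation size: sizeof(hostname) is only a pointer size. */
	snprintf(hostname, HOSTBUF_LEN, "%s", host);
	n = snprintf(temp, HOSTBUF_LEN, "%s.%s", prefix, hostname);
	if (n < 0 || n >= HOSTBUF_LEN)
		goto done;		/* formatting failed or did not fit */
	if ((size_t)n >= outlen)
		goto done;		/* caller's buffer is too small */
	memcpy(out, temp, (size_t)n + 1);
	error = 0;
done:
	free(hostname);			/* free(NULL) is a no-op */
	free(temp);
	return (error);
}
```

The comment about sizeof matters for any change of this kind: once a
char array becomes a char pointer, an existing sizeof(hostname) argument
silently shrinks to the size of a pointer.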
 
 thank you,
 -Alfred
 
 
 Index: kern/kern_sig.c
 ===
 --- kern/kern_sig.c   (revision 207329)
 +++ kern/kern_sig.c   (working copy)
 @@ -3004,8 +3004,9 @@
   char *temp;
   size_t i;
   int indexpos;
 - char hostname[MAXHOSTNAMELEN];
 + char *hostname;
   
 + hostname = NULL;
   format = corefilename;
   temp = malloc(MAXPATHLEN, M_TEMP, M_NOWAIT | M_ZERO);
   if (temp == NULL)
 @@ -3021,6 +3022,19 @@
   sbuf_putc(sb, '%');
   break;
   case 'H':   /* hostname */
 + if (hostname == NULL) {
 + hostname = malloc(MAXHOSTNAMELEN,
 + M_TEMP, M_NOWAIT);
 + if (hostname == NULL) {
 + log(LOG_ERR,
 + "pid %ld (%s), uid (%lu): "
 + "unable to alloc memory "
 + "for corefile hostname\n",
 + (long)pid, name,
 + (u_long)uid);
 +goto nomem;
 +}
 +}
   getcredhostname(td->td_ucred, hostname,
   sizeof(hostname));
   sbuf_printf(sb, "%s", hostname);
 @@ -3054,9 +3068,10 @@
   }
  #endif
   if (sbuf_overflowed(sb)) {
 - sbuf_delete(sb);
   log(LOG_ERR, "pid %ld (%s), uid (%lu): corename is too long\n", (long)pid, name, (u_long)uid);
 +nomem:
 + sbuf_delete(sb);
   free(temp, M_TEMP);
   return (NULL);
   }
 Index: kern/imgact_elf.c
 ===
 --- kern/imgact_elf.c (revision 207329)
 +++ kern/imgact_elf.c (working copy)
 @@ -1088,8 +1088,10 @@
   hdrsize = 0;
   __elfN(puthdr)(td, (void *)NULL, hdrsize, seginfo.count);
  
 - if (hdrsize + seginfo.size >= limit)
 - return (EFAULT);
 + if (hdrsize + seginfo.size >= limit) {
 + error = EFAULT;
 + goto done;
 + }
  
   /*
* Allocate memory for building the header, fill it up,
 @@ -1097,7 +1099,8 @@
*/
   hdr = malloc(hdrsize, M_TEMP, M_WAITOK);
   if (hdr == NULL) {
 - return (EINVAL);
 + error = EINVAL;
 + goto done;
   }
   error = __elfN(corehdr)(td, vp, cred, seginfo.count, hdr, hdrsize,
   gzfile);
 @@ -1125,8 +1128,8 @@
   curproc->p_comm, error);
   }
  
 +done:
  #ifdef COMPRESS_USER_CORES
 -done:
   if (core_buf)
   free(core_buf, M_TEMP);
   if (gzfile)
 -- 
 - Alfred Perlstein
 .- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
 .- FreeBSD committer

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250, 07 zx10
.- FreeBSD committer


Re: grep

2010-04-05 Thread Alfred Perlstein
* Gabor Kovesdan ga...@kovesdan.org [100330 08:52] wrote:
 On 30/03/2010 20:00, Mark nesterovych wrote:
 Hi all.
 
 I have decided to write a BSD-licensed grep and contribute it to the
 FreeBSD project if it succeeds.

 
 Dear Mark,
 
 this project is already completed and is going to be integrated to the 
 base system once portmgr can run an experimental build to make sure it 
 introduces no regressions. I suggest that you consider working on either 
 diff/sdiff or you can contribute to my sort implementation, which is not 
 totally completed yet.

Hello,

Where are the diff/sdiff projects?

thank you,
-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: Unique process id (not pid) and accounting daemon

2010-01-30 Thread Alfred Perlstein
* cronfy cro...@gmail.com [100128 06:16] wrote:
  To ensure that process in the process tree and process in the
  accounting file are the same, I want to add unique process identifier
  (uint64_t) to 'proc' struct in sys/sys/proc.h and increment it for
  every process fork. I see it is possible to do this just before
  sx_sunlock() in fork1() in sys/kern/kern_fork.c.
  Now that I know this, I would suggest simply recording the start
  time as the serial number, then using pid+recorded_start_time as
  your serial number.
 
 This may lead to duplicate ids: pid may be reused and time may be
 shifted to give exactly the same start_time as it was used with this
 pid earlier. Simple increment will work fine.
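
A userland sketch of the "simple increment" approach, using C11 atomics
so it stays correct without a lock.  The names are hypothetical; in the
kernel the counter would presumably be bumped under whatever lock
fork1() already holds at that point.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global serial; unlike a pid it is never reused. */
static _Atomic uint64_t next_proc_serial = 1;

/* Hand out the next serial number; called once per fork. */
uint64_t
alloc_proc_serial(void)
{
	return (atomic_fetch_add(&next_proc_serial, 1));
}
```

A 64-bit counter is effectively inexhaustible here: even at one fork per
nanosecond it would take more than 500 years to wrap.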

You're right.  I was still stuck in "start time doesn't change" mode
(assuming that if the time changed, the start time used to
tag processes would be incremented akin to your increasing number).

meh :)

 Ok, as far as no one else commented at my idea, I assume it is not
 completely stupid and will try to implement this :)

It sounds good! 

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: Unique process id (not pid) and accounting daemon

2010-01-24 Thread Alfred Perlstein
* cronfy cro...@gmail.com [100124 15:59] wrote:
 Hello.
 
 Sorry for the crosspost, I intended to post this to freebsd-hackers@, but
 sent first copy to freebsd-questions@ by mistake.
 
 
 I am trying to create an accounting daemon that would be more precise
 than usual BSD system accounting. It should read the whole process
 tree from time to time (say, every 10 seconds) and log changes in
 usage of CPU, I/O operations and memory per process. After daemon
 notices process exit, it should read /var/account/acct to get a last
 portion of accounting data and make a last entry for the process. Also
 daemon should read /var/account/acct to find information about
 processes that had been running between taking process tree snapshots.
 
 There is a problem: it is not always possible to link a process in a
 process tree against matching process in an accounting file. Only
 command name, user/group id  and start time will match, but:
 
  * start time may change (i. e. after ntpdate);
  * command name saved in /var/account/acct is 15 characters max
 (AC_COMM_LEN in sys/sys/acct.h), while command name in the process
 tree is 19 characters max (MAXCOMLEN in sys/sys/param.h).
 
 To ensure that process in the process tree and process in the
 accounting file are the same, I want to add unique process identifier
 (uint64_t) to 'proc' struct in sys/sys/proc.h and increment it for
 every process fork. I see it is possible to do this just before
 sx_sunlock() in fork1() in sys/kern/kern_fork.c. I'll have to add
 saving of this identifier in kern_acct.c, of course.
 
 This way it will be extremely easy to remember a process in the process
 tree and find a matching one in the accounting file after it finishes.
 
 Am I looking in a right direction or should I try some other way?
 Thanks in advance.

I've thought about this a few times, specifically how to avoid
sending a signal to the wrong process by accident, for example by adding
a version of kill(2) that also takes the process start time.

It's interesting that you bring up that start time can change, I 
did not know this.

Now that I know this, I would suggest simply recording the start
time as the serial number, then using pid+recorded_start_time as
your serial number.

Just an idea.

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: global TCP_NODELAY?

2009-10-12 Thread Alfred Perlstein
* Ivan Voras ivo...@freebsd.org [091012 04:29] wrote:
 I'm trying to work around some extreme brain damageness in PHP (yes, it 
 sucks) which doesn't have a way to set TCP_NODELAY on stream sockets so 
 I'm wondering what are my other options? Is there a way to set 
 TCP_NODELAY system-wide?

Ivan, many people write php extensions, maybe you can do that?


-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: fcntl(F_RDAHEAD)

2009-09-17 Thread Alfred Perlstein
  0x0002
  #endif
 +#endif
  
  /* Defined by POSIX Extended API Set Part 2 */
  #if __BSD_VISIBLE
 @@ -218,6 +222,7 @@
  #define  F_SETLK 12  /* set record locking 
 information */
  #define  F_SETLKW13  /* F_SETLK; wait if blocked */
  #define  F_SETLK_REMOTE  14  /* debugging support for remote 
 locks */
 +#define  F_READAHEAD 15  /* read ahead */
  
  /* file descriptor flags (F_GETFD, F_SETFD) */
  #define  FD_CLOEXEC  1   /* close-on-exec flag */


-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


script(1) issue/question

2009-09-16 Thread Alfred Perlstein
[[ peter cc'd cause he seemed to add the original
   exec a non-shell option to script(1) ]]

Hello all,

I noticed that when running script and passing a program
to exec that ^Z does not seem to work (although ^C does).

I'm trying to figure a workaround and what I was going to
do was add ISIG to the term flags when spawning a non-shell
utility.

(should I also check /etc/shells to help preserve POLA
further?)

Any pointers on this?

Would this be a good idea, or a bad idea?  Terminal gurus
give me a hand please! :)

please ignore the sigflg part at the top for now, prepping
for possible cli option to avoid POLA breakage.

Is there a way to detect ^Z or other terminal signals and propagate
them to the child in a better way?

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer
Index: script.c
===
--- script.c	(revision 195826)
+++ script.c	(working copy)
@@ -68,6 +68,7 @@
 int	child;
 const char *fname;
 int	qflg, ttyflg;
+int	sigflg;
 
 struct	termios tt;
 
@@ -104,6 +105,9 @@
 		case 'k':
 			kflg = 1;
 			break;
+		case 'S':
+			sigflg = 1;
+			break;
 		case 't':
 			flushtime = atoi(optarg);
 			if (flushtime < 0)
@@ -241,11 +245,20 @@
 doshell(char **av)
 {
 	const char *shell;
+	struct termios rtt;
 
 	shell = getenv("SHELL");
 	if (shell == NULL)
 		shell = _PATH_BSHELL;
 
+	if (av[0]) {
+		/* enable signals for non-shell programs */
+		rtt = tt;
+		cfmakeraw(&rtt);
+		rtt.c_lflag &= ~ECHO;
+		rtt.c_lflag |= ISIG;
+		(void)tcsetattr(STDIN_FILENO, TCSAFLUSH, &rtt);
+	}
 	(void)close(master);
 	(void)fclose(fscript);
 	login_tty(slave);

Re: memchr() strangeness

2009-09-04 Thread Alfred Perlstein
Moved to -hackers.

Gabor, can you please make a smaller program to exhibit this behavior?
(not just the error line)

I will be glad to help out.

-Alfred

* Gabor Kovesdan ga...@freebsd.org [090904 10:04] wrote:
 Hello,
 
 having returned from vacation, I'm trying to track down the (hopefully) 
 last critical bug in the BSDL grep I worked on last summer. The binary 
 file detection is implemented as follows:
 f->binary = memchr(binbuf, (filebehave != FILE_GZIP) ? '\0' : '\200', i 
 - 1) != NULL;
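
For anyone who wants to reproduce the check in isolation, a minimal
stand-alone version might look like the sketch below.  The function name
is made up, and the sketch only shows the memchr() test itself; it does
not explain the chroot difference.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the first len bytes of buf contain the marker byte
 * (NUL for plain files in the line quoted above), 0 otherwise.
 */
int
looks_binary(const void *buf, size_t len, int marker)
{
	/*
	 * Guard the empty case explicitly: a caller computing "i - 1"
	 * with an unsigned i of 0 would otherwise pass a huge length.
	 */
	if (len == 0)
		return (0);
	return (memchr(buf, marker, len) != NULL);
}
```

Feeding the same bytes to this function in both environments would show
whether the buffer contents or the length passed in is what differs.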
 
 There's something strange with this. In my normal environment it works fine:
 server# echo foobar | ./grep -v '^ *+'
 foobar
 
 But in a chroot environment the binary detection is broken:
 # echo foobar | grep -v '^ *+'
 foobar
 Binary file (standard input) matches
 
 I don't know where things go bad. I've tried to print out the content of 
 the buffer and the buffer length and they are the same but somehow in 
 the chrooted environment this sets f->binary to true.
 Any suggestions?
 
 Thanks in advance,
 
 -- 
 Gabor Kovesdan
 FreeBSD Volunteer
 
 EMAIL: ga...@freebsd.org .:|:. ga...@kovesdan.org
 WEB:   http://people.FreeBSD.org/~gabor .:|:. http://kovesdan.org
 
 -- 
 This mail is for the internal use of the FreeBSD project committers,
 and as such is private. This mail may not be published or forwarded
 outside the FreeBSD committers' group or disclosed to other unauthorised
 parties without the explicit permission of the author(s).

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: Common interface for sensors/health monitoring

2009-08-22 Thread Alfred Perlstein
* Alexander Leidinger alexan...@leidinger.net [090822 10:44] wrote:
 On Sat, 22 Aug 2009 00:04:10 -0700 Julian Elischer
 jul...@elischer.org wrote:
 
  The purists won out in that one by shouting loudly and screaming
  about socialized healthware. Consequently we have 47 million
  unsupported devices.
 
 You forgot to tell that now nobody wants to touch this subject anymore,
 as he may be the target of similar shouting then.

I say good riddance.  If someone wants their hardware not to melt
then each machine should be personally responsible and enroll in
a private monitoring service; we don't need project-sponsored health
monitoring.

(ron paul!)

-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: Spot the error

2009-08-05 Thread Alfred Perlstein
* Dimitry Andric dimi...@andric.com [090805 06:51] wrote:
 On 2009-08-05 02:15, Mel Flynn wrote:
  I would expect "Unable to load fs: " + ENOENT. I was asking if this was 
  fixable, cause it looked like the code has been abstracted to the point 
  that specific errors were hard, but maybe I missed something.
 
 It does not seem easily fixable.  The problem is that the mount command
 simply prints out the error of the system call, and doesn't have any
 idea which of the (many) parameters was wrong.
 
 You could change the returned error in this particular case to ENOENT,
 of course, but that might be considered even more confusing.  Like,
 "What do you mean, that SCSI disk doesn't exist? It's right there in
 /dev!"
 
 One could also argue for EINVAL, but there's the bikeshed again... :)

mount(8) could be augmented to preflight/postflight the vfs type name
to provide a better error.


-- 
- Alfred Perlstein
.- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
.- FreeBSD committer


Re: distributed scm+freebsd svn?

2009-08-04 Thread Alfred Perlstein
This is amazing!  Have you considered an article for publication based
on your experiences?  Perhaps a handbook article?

This will be very helpful going forward.  

(Although, I do kind of wish it was an "instructions fit on
the back of a napkin" kind of operation.) :)

thanks again!
-Alfred

* Giorgos Keramidas keram...@freebsd.org [090802 18:32] wrote:
 On Sun, 26 Jul 2009 16:15:34 -0700, Alfred Perlstein alf...@freebsd.org 
 wrote:
  Hello hackers,
 
  Does anyone here use one of the distributed SCMs to manage
  contributions to FreeBSD in an easy manner?
 
 Hi Alfred,
 Yes, I do that.
 
  Any pointers to a setup you have?
 
  I thought git was supposed to make this easy, but going over the
  docs leaves me with a lot of questions.
 
 Git is a wonderful system but its UI and documentation often make me
 want to scream bad things.  My own suggestion is to go with Mercurial,
 because its command set looks a *lot* like CVS or Subversion, it's often
 as fast or even faster than Git, and it doesn't seem as 'confusing' as Git.
 
 More details below...
 
  I'm hoping to be able to basically:
sync into my distributed repo.
allow a third party access to it.
easily commit upstream back into svn from a branch
  in my distributed scm.
 
 I use a local Mercurial repository for my own patches.  It seems to
 support most of the things I want to do, i.e.:
 
   * Keep a clean `/hg/bsd/head' workspace and pull full changesets into
 that from our svn repository
 
   * Support incremental updates of `/hg/bsd/head'.
 
   * Easily clone my `/hg/bsd/head' to one or more `feature' branches.
 
   * Allow others to pull from `head' as a read-only source over http or
 ssh.
 
 The /head branch has a huge history that I don't really want to keep
 around in every clone.  So I started my conversion from 2007-12-31 and I
 keep updating it with the `hg convert ...' command wrapped in a small
 shell script:
 
 $ cat /hg/bsd/pull-head.sh
 #!/bin/sh

 set -e
 hg convert \
     --config convert.svn.startrev='175021' \
     --config convert.svn.trunk='head' \
     --config convert.svn.branches='' \
     --config convert.svn.tags='' \
     file:///home/svn/base/ \
     /hg/bsd/head
 
 You can use the webdav http://svn.freebsd.org/base/ or an SSH tunneled
 URI to access the Subversion repository, but I keep a local mirror of the
 Subversion repository too, so I prefer that.
 
 
 Typical Mercurial-based Workflow
 
 
   1. Pull subversion commits into the 'head' workspace.
 
   2. Pull these changes from 'head' to my working tree.
 
   3. Merge the changes with the local patches of the working tree.
 
   4. Extract one or more patches for committing to Subversion
 
   5. Rinse, lather, repeat...
 
 Pulling the latest commits from Subversion
 --
 
 The first step is the easiest bit.  I just run `/hg/bsd/pull-head.sh'.
 
 This requires an installed copy of the Python bindings of Subversion
 [devel/py-subversion] and the `convert' extension enabled in my ~/.hgrc
 file with:
 
 [extensions]
 convert =
 
 A sample run of `pull-head.sh' looks like this:
 
 keram...@kobe:/hg/bsd$ time ./pull-head.sh
 scanning source...
 sorting...
 converting...
 1 Many network stack subsystems use a single global data structure to hold
 0 Add padding to struct inpcb, missed during our padding sweep earlier in
 3.306 real  1.809 user  0.619 sys
 keram...@kobe:/hg/bsd$
 
 This is reasonably fast, but it does come with an important caveat.
 It's not terribly important for my own work, but it *may* be for yours:
 
 The Python bindings of Subversion do not support svn:keywords, so
 all our manually configured '$FreeBSD$' stuff is unexpanded in the
 converted tree.  Mergemaster may cause various levels of fun and
 amusement if you mix, match and alternate between svn-based and
 mercurial-based workspaces often!
 
 At this point, after pull-head.sh has finished running, the most
 recent commit in the head/.hg/ workspace state is the last commit by
 rwatson:
 
 keram...@kobe:/hg/bsd/head$ hg log --limit 1
 changeset:   12589:8ce7c7a0b804
 branch:      head
 tag:         tip
 user:        rwatson
 date:        Sun Aug 02 22:47:08 2009 +
 summary:     Add padding to struct inpcb, missed during our padding sweep earlier in
 
 keram...@kobe:/hg/bsd/head$
 
 This clone/workspace is my 'clean' slate, and it only contains an `.hg'
 data store.  No checkout or other workspace contents:
 
 keram...@kobe:/hg/bsd/head$ ls -la
 total 6
 drwxr-xr-x  3 keramida  users  - 512 Nov 10  2008 .
 drwxr-xr-x  8 keramida  users  - 512 Aug  3 02:36 ..
 drwxr-xr-x  3 keramida  users  - 512 Aug  3 02:36 .hg
 keram

distributed scm+freebsd svn?

2009-07-26 Thread Alfred Perlstein
Hello hackers,

Does anyone here use one of the distributed SCMs
to manage contributions to FreeBSD in an easy
manner?

Any pointers to a setup you have?

I thought git was supposed to make this easy, but
going over the docs leaves me with a lot of questions.

I'm hoping to be able to basically:
  sync into my distributed repo.
  allow a third party access to it.
  easily commit upstream back into svn from a branch
in my distributed scm.




-- 
- Alfred Perlstein
VMOA #5191, 03 vmax, 92 gs500, ch250 - FreeBSD
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: c question: *printf'ing arrays

2009-06-30 Thread Alfred Perlstein

-- 
- Alfred Perlstein
VMOA #5191, 03 vmax, 92 gs500, ch250 - FreeBSD


Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?

2009-05-29 Thread Alfred Perlstein
* Dag-Erling Smørgrav d...@des.no [090529 02:49] wrote:
 Alfred Perlstein alf...@freebsd.org writes:
  Dag-Erling Smørgrav d...@des.no writes:
   Usually, what you see is closer to this:
   
   if ((pid = fork()) == 0) {
           for (int fd = 3; fd < getdtablesize(); ++fd)
                   (void)close(fd);
           execve(path, argv, envp);
           _exit(1);
   }
 
  I'm probably missing something, but couldn't you iterate 
  in the parent setting the close-on-exec flag then vfork?
 
 This is an example, Alfred.  Like most examples, it is greatly
 simplified.  I invite you to peruse the source to find real-world
 instances of non-trivial fork() / execve() usage.

It wasn't meant to criticize, just ask a question for the specific
instance because it made me curious.  I know how bad it can be with
vfork as I observed a few fixes involving mistaken use of vfork at
another job.

So yes, there's more than one way to skin a cat for this particular
example... but in practice using vfork()+exec() is hard to get right?

-- 
- Alfred Perlstein


Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?

2009-05-28 Thread Alfred Perlstein
* Dag-Erling Smørgrav d...@des.no [090527 06:10] wrote:
 Yuri y...@rawbw.com writes:
  I don't have strong opinion for or against memory overcommit. But I
  can imagine one could argue that fork with intent of exec is a faulty
  scenario that is a relict from the past. It can be replaced by some
  atomic method that would spawn the child without overcommitting.
 
 You will very rarely see something like this:
 
 if ((pid = fork()) == 0) {
         execve(path, argv, envp);
         _exit(1);
 }
 
 Usually, what you see is closer to this:
 
 if ((pid = fork()) == 0) {
         for (int fd = 3; fd < getdtablesize(); ++fd)
                 (void)close(fd);
         execve(path, argv, envp);
         _exit(1);
 }

I'm probably missing something, but couldn't you iterate 
in the parent setting the close-on-exec flag then vfork?

I guess that wouldn't work for threads AND you'd have to
undo it after the fork if you didn't want to retain that
behavior?
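
A minimal sketch of the close-on-exec idea, assuming POSIX fcntl(2). The parent marks each open descriptor in a range with FD_CLOEXEC and then calls vfork() and execve(). The helper names mark_cloexec_range and spawn_cloexec are hypothetical, and the caveats raised above still apply: the flag stays set in the parent afterwards, and another thread could open a descriptor between the marking pass and the vfork().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* glibc: expose vfork() and getdtablesize() */
#endif
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Mark every open descriptor in [lowfd, highfd) close-on-exec.
 * Returns the number of descriptors that were marked.
 */
int
mark_cloexec_range(int lowfd, int highfd)
{
	int marked = 0;

	for (int fd = lowfd; fd < highfd; ++fd) {
		int flags = fcntl(fd, F_GETFD);

		if (flags == -1)
			continue;	/* not an open descriptor */
		if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == 0)
			++marked;
	}
	return (marked);
}

/*
 * Hypothetical spawn helper: all descriptor work happens in the parent,
 * so the vfork() child only calls execve() or _exit().
 */
pid_t
spawn_cloexec(const char *path, char *const argv[], char *const envp[])
{
	pid_t pid;

	mark_cloexec_range(3, getdtablesize());
	if ((pid = vfork()) == 0) {
		execve(path, argv, envp);
		_exit(127);	/* only reached if execve() failed */
	}
	return (pid);
}
```

The child no longer loops over descriptors, which is the part that is awkward after vfork(); the cost is that the parent's descriptors are permanently changed unless the flags are cleared again.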

thanks,
-Alfred


Re: porting info for FreeBSD's kernel?

2009-05-23 Thread Alfred Perlstein
* Chuck Robey chu...@telenix.org [090522 07:09] wrote:
 Alfred Perlstein wrote:
  I wouldn't sweat the compiler as much as the actual OS code, I think
  it should be relatively easy to trick the build to use an external
  compiler (ie, don't get caught up in the compiler bootstrap quagmire,
  leave that for later...)
  
  Anyhow, you're talking to someone that has studied, but not implemented
  a port, so take my advice with a few heaps of salt. :)
  
  Typically what people focus on is:
  
  1) how am I going to get the first line of dmesg to come up
  2) how am I going to get to single user mode
  3) multi user?
  4) cleanup of compiler and bootstrap issues.
  
  If you get sidetracked by #4, you can spend months doing that
  instead of just rolling with it when you get there.
 
 
 I'll admit it's not terribly hard to just get a foreign compiler to work, and
 I've already gotten a version of gcc-4.3.1 jiggered.  I was going to
 concentrate next on cleaning up the compiler issue, which is why I wanted to
 get a pronouncement on which way to go.  If I simply try to duck as much of
 that issue as possible, I can use the gcc-4.3.1 without huge problems.  I can
 see that fine... BUT the next part, getting the booting working, that does
 seem to be something which is necessary to do.  How could I just duck out of
 that the way I could easily do for the compiler?  I mean, how could you cause
 the booting to get fooled into thinking it was working?  If you could give me
 an example of any possible way to get past this issue, I'm willing to do as
 you request, if only I could recognize the action you're asking me to take.

Oh, I wasn't suggesting that you somehow fake up the loader part,
you'll have to do that too! :)

Perhaps a pre-step then should be:
0) get the loader working in some form. :)

 In the meantime (until I understand what you're asking for) I'm rereading my
 old Dragon book, so I can begin to understand what llvm is doing.  Sandeep
 Patel, of llvm, btw, tells me that the -A8 and -A9 work on llvm is going very
 rapidly, and it may well be ready before we realize, so being able to push off
 making the compiler decision is actually maybe quite a good thing to
 contemplate.

you can really spend forever on this, again, unless you have a pressing
need due to the compiler being completely broken, it's a bad idea
to focus on cleanliness first.

first work on getting it to boot, only stop if you hit a bug, don't
clean or you'll never finish.

again, this has only been my observation, I'm no porting OS master,
but I have observed a few ports and my suggestions are what I've
observed to have been the course of action of successful porters.

I've also observed that whenever someone gets caught up in the
details, they usually fail.

good luck,
-- 
- Alfred Perlstein


Re: porting info for FreeBSD's kernel?

2009-05-22 Thread Alfred Perlstein
* Chuck Robey chu...@telenix.org [090521 14:56] wrote:
 Alfred Perlstein wrote:
  * Chuck Robey chu...@telenix.org [090518 13:03] wrote:
 
  I've been googling, trying to see if I can find notes regarding what needs
  changing, in what order, to adapt the FreeBSD kernel to a new processor.
  Anyone know where stuff like that can be found?
  
  You need a cross compile toolchain of course, look into how FreeBSD
  is configured for the various arches.
  
  Then I would suggest looking at the loaders, followed by 
  kern/init_main.c.  If you trace down init_main.c and some
  of the early sysinits that should give you an idea.
  
  You might also be able to backtrack using CVS/svn to follow
  how mips or arm was done.
  
  Note: freebsd has a decent cross-compile setup now, see
  make universe so things should be easier to get started.
  
 
 Thanks.  I will *definitely* read all the parts you hint me at, I won't be
 deleting this mail, and I appreciate it.  I was asking on the llvm maillist
 about Cortex-A8 support.  What I got says that it's not there yet, but it's
 being worked upon, that and the -A9 support (definite differences).  So, any
 crosstools needed today would have to be gcc, from a version at least as new
 as the 4.3 branch (that's where they brought in the -A8 support).
 
 The tool I got by doing the FreeBSD crosstools was 4.2.1, which isn't going to
 do it for the Cortex-A8, and I had someone else (from a FreeBSD list) tell me
 that bringing in a newer version of gcc wasn't extremely likely, that they'd
 want llvm instead.  I see 3 alternatives for a Cortex-A8 port: using a new gcc
 port, waiting on the upgrade of llvm, or maybe deciding that the version of
 llvm that's out now, with the v6 compatibility, would be (for the short term)
 good enough.  Any idea which one to choose?  The only one that interests me is
 for the TI OMAP 3530 (Cortex-A8, among other parts).  Maybe if the currently
 available llvm is good enough, maybe gcc-4.2.1 may creak along well enough for
 the short term?  I need to understand this.
 
 My own personal Pandora probably won't arrive on my doorstep for maybe as
 long as 3 more months, so in the meantime, I think I will be reading all I can
 get my hands on regarding llvm.  Maybe I can really learn enough to make a
 difference.  In school, I concentrated very definitely on OSes (I've written 3
 of them over the years, of quite varying levels of performance), so for
 compilers, I'm relying on my old 1988 version of the Aho/Sethi/Ullman
 compilers book.  If anyone knows a more modern book that will show me enough
 about compilers to be useful, I'd really appreciate the name, maybe Amazon
 will let me get a cheap used version.

I wouldn't sweat the compiler as much as the actual OS code, I think
it should be relatively easy to trick the build to use an external
compiler (ie, don't get caught up in the compiler bootstrap quagmire,
leave that for later...)

Anyhow, you're talking to someone that has studied, but not implemented
a port, so take my advice with a few heaps of salt. :)

Typically what people focus on is:

1) how am I going to get the first line of dmesg to come up
2) how am I going to get to single user mode
3) multi user?
4) cleanup of compiler and bootstrap issues.

If you get sidetracked by #4, you can spend months doing that
instead of just rolling with it when you get there.


-- 
- Alfred Perlstein


Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?

2009-05-22 Thread Alfred Perlstein
* Yuri y...@rawbw.com [090521 10:52] wrote:
 Nate Eldredge wrote:
 Suppose we run this program on a machine with just over 1 GB of 
 memory. The fork() should give the child a private copy of the 1 GB 
 buffer, by setting it to copy-on-write.  In principle, after the 
 fork(), the child might want to rewrite the buffer, which would 
 require an additional 1GB to be available for the child's copy.  So 
 under a conservative allocation policy, the kernel would have to 
 reserve that extra 1 GB at the time of the fork(). Since it can't do 
 that on our hypothetical 1+ GB machine, the fork() must fail, and the 
 program won't work.
 
 I don't have strong opinion for or against memory overcommit. But I 
 can imagine one could argue that fork with intent of exec is a faulty 
 scenario that is a relict from the past. It can be replaced by some 
 atomic method that would spawn the child without overcommitting.

vfork, however that's not sufficient for many scenarios.
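
For the simple spawn-a-program case, POSIX also offers posix_spawn(3), which combines the fork and exec steps in one call. The sketch below is an illustration only: spawn_and_wait is a hypothetical helper, and whether a given implementation avoids the overcommit question depends on how it is built underneath (it is often layered on vfork() or similar).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <spawn.h>
#include <stddef.h>

extern char **environ;

/*
 * Run path with argv and wait for it to finish.
 * Returns the child's exit status, or -1 if it could not be spawned
 * or did not exit normally.
 */
int
spawn_and_wait(const char *path, char *const argv[])
{
	pid_t pid;
	int status;

	/* No file actions and no spawn attributes: inherit everything. */
	if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
		return (-1);
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status));
}
```

Anything that has to happen between fork and exec beyond what the file-actions and attributes arguments can express still needs fork() or vfork(), which is the "not sufficient for many scenarios" part.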

 Are there any other than fork (and mmap/sbrk) situations that would 
 overcommit?

sysv shm?  maybe more.

-- 
- Alfred Perlstein


Re: porting info for FreeBSD's kernel?

2009-05-20 Thread Alfred Perlstein
* Chuck Robey chu...@telenix.org [090518 13:03] wrote:
 
 I've been googling, trying to see if I can find notes regarding what needs
 changing, in what order, to adapt the FreeBSD kernel to a new processor.
 Anyone know where stuff like that can be found?

You need a cross compile toolchain of course, look into how FreeBSD
is configured for the various arches.

Then I would suggest looking at the loaders, followed by 
kern/init_main.c.  If you trace down init_main.c and some
of the early sysinits that should give you an idea.

You might also be able to backtrack using CVS/svn to follow
how mips or arm was done.

Note: freebsd has a decent cross-compile setup now, see
make universe so things should be easier to get started.

-- 
- Alfred Perlstein


FreeBSD jobs

2009-05-15 Thread Alfred Perlstein
 On Thu, May 14, 2009 at 9:53 AM, Julian Stacey jhs at berklix.org wrote:
  Hi hackers@
  A commercial firm asked for _Free_ labour today on jobs at freebsd.
  The censors passed it.  Censors of jobs at freebsd.org then blocked
  the posting below.  jobs@ censors again bad, block wrong things,
  should all be removed  not replaced.
 
  Several suckers have already enquired to that firm.  Hope we might get
  some Free labour to donate time to Freebsd, Not Stock holders !
 
 Hi Julian,
 
 Internships are an accepted way for a high school or university
 student (and nowadays some post grad students and others) to gain a
 bit of experience in their field before joining the work force or
 perhaps while switching careers. At my company, we've filled several
 full-time positions with people that were interns first. It's just a
 way to fill a part-time, sometimes non-paid job, at a company where
 there isn't an official requisition for that particular position.
 Nobody is forcing anybody to take the internship and it is clearly
 stated that it is a non-paid internship in the post.
 
 I imagine that there would be some interested students or unemployed
 people that would love to work with Alfred on a project at Juniper a
 few hours a week in their spare time, for free. It will look great on
 a resume, they will probably learn some valuable skills, and perhaps
 parlay it into a full-time, paid position.
 
 best,
 -matt

Thanks Matt, this is my only intention.

A few of the candidates I've spoken to are very excited to get
something on their resume with a commercial entity and there is the
hope that I may be able to hire one of them in the future.

I've also promised the candidates that they will have access to
some amazing resources within Juniper (if I can manage it) and at the
very least I can mentor them on any FreeBSD endeavors they take on
during the hours outside the 2 per day they would be working for me.

While I would love to see more students working on FreeBSD, the
fact of the matter is that some students already have worked on
FreeBSD and would like commercial experience of working on a team
in an office environment, which is challenging to mimic in our
(FreeBSD's) distributed ways.

At the end of the day, what FreeBSD-jobs is supposed to be is a
place where jobs can be posted and found that will enable a FreeBSD
fan to find suitable employment opportunities for a career, or
to advance their career.

The reason for moderation of FreeBSD-jobs is to prevent people such
as Julian turning a well intentioned message into a thread of
flames because he's gone imbalanced due to lack of coffee some
morning.

Effectively it's been a pretty swell system, FreeBSD-jobs has 0
spam (except to the poor moderators) and also insulated job seekers
and posters from the typical hecklers who feel the need for extremely
abusive emails due to some real or perceived mistake by the recruiter
or job-seeker.

I honestly feel that we've even saved plenty of people embarrassment
by blocking or bouncing messages that they may have sent in haste
to freebsd-jobs, only to realize after cooling off: "the Internet is
forever, why in g-d's name did I send something so mean with my
name on it?!?"

It's a shame it doesn't work for cross-list posts. 

I'm proud to be one of the moderators on FreeBSD-jobs, but I do
admit most of the work is done by the other moderators.

Thanks again Matt.  I'm going to have to pick your brain later about
how to deal with interns, care, feeding, hats? :)

And Julian, chill out, I still cringe from embarrassment when someone
drags out some old email _I_ sent with close to the same tone as
the ones I've been seeing from you.  Best of luck.

-Alfred


question about dev/md/md.c out of swap?

2009-04-21 Thread Alfred Perlstein
Hello, a developer here at work asked me to go over 
some of the swapper code with him.

We came across something neither of us could understand,
so I was wondering if anyone had looked at this.

in dev/md/md.c mdstart_swap() there is the following code,
it seems that in the case of VM_PAGER_ERROR most of the state
is unwound, however the page is not freed.  Is this a bug or
are we missing something?  How is the page released?


    rv = VM_PAGER_OK;
    VM_OBJECT_LOCK(sc->object);
    vm_object_pip_add(sc->object, 1);
    for (i = bp->bio_offset / PAGE_SIZE; i <= lastp; i++) {
        len = ((i == lastp) ? lastend : PAGE_SIZE) - offs;

        m = vm_page_grab(sc->object, i,
            VM_ALLOC_NORMAL|VM_ALLOC_RETRY);
        VM_OBJECT_UNLOCK(sc->object);
        sched_pin();
        sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
        VM_OBJECT_LOCK(sc->object);
        if (bp->bio_cmd == BIO_READ) {
            if (m->valid != VM_PAGE_BITS_ALL)
                rv = vm_pager_get_pages(sc->object, &m, 1, 0);
            if (rv == VM_PAGER_ERROR) {
                sf_buf_free(sf);
                sched_unpin();
                vm_page_lock_queues();
                vm_page_wakeup(m);
                vm_page_unlock_queues();
                break;
            }
            bcopy((void *)(sf_buf_kva(sf) + offs), p, len);
        } else if (bp->bio_cmd == BIO_WRITE) {
            if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL)
                rv = vm_pager_get_pages(sc->object, &m, 1, 0);
            if (rv == VM_PAGER_ERROR) {
                sf_buf_free(sf);
                sched_unpin();
                vm_page_lock_queues();
                vm_page_wakeup(m);
                vm_page_unlock_queues();
                break;
            }

-- 
- Alfred Perlstein


Re: 'libc_r: enter/leave_cancellation_point()

2008-10-21 Thread Alfred Perlstein
Hey Norbert, this is probably a bug, but might not be addressed
because libc_r is not really supported any longer.

Someone may pick it up, but I'm uncertain of that.

* Norbert Koch [EMAIL PROTECTED] [081021 01:32] wrote:
 Hello,
 
 I was just inspecting libc_r for trying to understand
 some things and found this:
 
 -
 
 --- src/lib/libc_r/uthread/uthread_cond.c 2002/05/24 04:32:28 1.33
 +++ src/lib/libc_r/uthread/uthread_cond.c 2002/11/13 18:13:26 1.34
 
 ...
 
  int
 -_pthread_cond_signal(pthread_cond_t * cond)
 +__pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex,
 +const struct timespec *abstime)
 +{
 + int ret;
 +
 + _thread_enter_cancellation_point();
 + ret = _pthread_cond_timedwait(cond, mutex, abstime);
 + _thread_enter_cancellation_point();
 + return (ret);
 +}
 
 
 
 
 Shouldn't that be _thread_leave_cancellation_point() after
 calling _pthread_cond_timedwait() ?
 What effect should I see if this is wrong?
 
 Best regards,
 
 Norbert Koch
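
A toy model of the imbalance described above. It assumes, purely for illustration, that the cancellation state is a single "at a cancellation point" flag that enter sets and leave clears; the names and the flag are stand-ins, not libc_r's actual implementation. Under that assumption, the wrapper as committed returns with the flag still set.

```c
/* Hypothetical stand-in for the per-thread cancellation state. */
int at_cancel_point;

static void enter_cancellation_point(void) { at_cancel_point = 1; }
static void leave_cancellation_point(void) { at_cancel_point = 0; }

/* Placeholder for the real _pthread_cond_timedwait() call. */
static int do_timedwait(void) { return (0); }

/* Shape of the wrapper in the quoted diff: enter is called twice. */
int
timedwait_as_committed(void)
{
	int ret;

	enter_cancellation_point();
	ret = do_timedwait();
	enter_cancellation_point();	/* the suspected typo */
	return (ret);
}

/* Shape of the wrapper with the suggested fix: enter, then leave. */
int
timedwait_balanced(void)
{
	int ret;

	enter_cancellation_point();
	ret = do_timedwait();
	leave_cancellation_point();
	return (ret);
}
```

If the real state behaves like this flag, a thread that has called the unbalanced wrapper would keep looking as if it were inside a cancellation point after the call returns, so a cancel request could be acted on at an unexpected place.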
 

-- 
- Alfred Perlstein


Re: [Info required] PC Architecture

2008-10-17 Thread Alfred Perlstein
* Jeroen Ruigrok van der Werven [EMAIL PROTECTED] [081016 08:06] wrote:
 -On [20081016 16:43], Srinivas ([EMAIL PROTECTED]) wrote:
 I have a theoretical understanding of the PC architecture and the
 details but have no idea of how things go under the hood(for a real
 computer).
 
 http://www.amazon.com/dp/0123706068/ - Computer Organization and Design: The
 Hardware/Software Interface by Patterson and Hennessy
 
 http://www.amazon.com/dp/0131485210/ - Structured Computer Organization by
 Tanenbaum
 
 That should answer most, if not all, of your questions on that subject.

I also REALLY like:

The 8088 Project Book
http://www.amazon.com/8088-Project-Book-Robert-Grossblatt/dp/0830631712

Although this isn't a real PC, it will give a nice start in 
case more technical stuff is too much too soon.


Re: sysctl text definitions.

2008-01-26 Thread Alfred Perlstein
* Dag-Erling Sm??rgrav [EMAIL PROTECTED] [080126 07:10] wrote:
 Alfred Perlstein [EMAIL PROTECTED] writes:
  Dag-Erling Sm??rgrav [EMAIL PROTECTED] writes:
  [EMAIL PROTECTED] ~% sysctl -d dev.cpu.0.temperature
  dev.cpu.0.temperature: Current temperature in degC
  lolwhat?  When did that get implemented?
 
 Twice, actually, in 1999 by myself and in 2001 by Luigi.
 
  I recall a huge storm of protest when the definitions were included in
  the kernel compile file...
 
 That was the first time, and completely unjustified as there was a knob
 to disable it (the argument was that it would bloat picobsd).

o i c. :)

 
 BTW, when are you going to join the 21st century and get a MUA that
 groks UTF-8?  :)

Civil people use the eighth bit for parity or parody, but nothing
else.

-- 
- Alfred Perlstein


Re: sysctl text definitions.

2008-01-26 Thread Alfred Perlstein
* Dag-Erling Sm??rgrav [EMAIL PROTECTED] [080126 07:28] wrote:
 Alfred Perlstein [EMAIL PROTECTED] writes:
  Dag-Erling Sm??rgrav [EMAIL PROTECTED] writes:
   BTW, when are you going to join the 21st century and get a MUA that
   groks UTF-8?  :)
  Civil people use the eighth bit for parity or parody, but nothing
  else.
 
 Thank you for excluding roughly three quarters of the world's population
 from participating in the FreeBSD community under their own name.

See that's the problem, your mailer interpreted the high bit as text
instead of sarcasm.

-- 
- Alfred Perlstein


Re: sysctl text definitions.

2008-01-25 Thread Alfred Perlstein
* Dag-Erling Sm??rgrav [EMAIL PROTECTED] [080125 07:58] wrote:
 Alfred Perlstein [EMAIL PROTECTED] writes:
  Hey guys, something that I've always wanted to do was actually somehow
  export those handy description strings from the kernel SYSCTL macros
  in the least obtrusive method possible.
 
 [EMAIL PROTECTED] ~% sysctl -d dev.cpu.0.temperature
 dev.cpu.0.temperature: Current temperature in degC

lolwhat?  When did that get implemented?  I recall a huge
storm of protest when the definitions were included in the
kernel compile file...

sorry for the noise.

-- 
- Alfred Perlstein


sysctl text definitions.

2008-01-24 Thread Alfred Perlstein
Hey guys, something that I've always wanted to do was actually somehow
export those handy description strings from the kernel SYSCTL macros
in the least obtrusive method possible.

The only thing I could come up with that didn't require compiling the
files twice was to basically do some tricks where the text strings
wound up in some throw-away section of the object files.

Any suggestions on how to do this?

In pseudo-code what I would do is something like change SYSCTL_*
and add the following:

SYSCTL_INT(, text) \
   ...old define...\
   SYSCTL_COMMENT(parent, node, text)

Also, add the following struct someplace:

struct sysctl_comment {
  const char *parent;
  const char *node;
  const char *comment;
};

Then SYSCTL_COMMENT does something like (more pseudocode):

#define SYSCTL_COMMENT(parent, node, comment) \
.set sysctl_comments { \
struct sysctl_comment uniquifier = { \
  .parent = parent; \
  .node = node; \
  .comment = comment; \
}; 


Then after building the kernel one should be able to do:
for file in kernel ${modules} ; do
  strip --section=sysctl_comments $file > $file.install
  objdump --section=sysctl_comments $file > $file.sysctl.out
  sysctl_help_database_builder $file.sysctl.out > $file.sysctl.db
done

Then these would be copied into /boot or maybe some other place
as part of the install process.

Sysctl or some other util could then read these db files to give
help with sysctls.

Any ideas/pointers on how to do this linker magic?
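
One way the linker part of this could look, sketched for userland so it can be tried outside the kernel. It assumes an ELF toolchain with GCC or Clang and GNU ld or lld, which define __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The macro shape, the fixed field sizes, and the lookup helpers are all hypothetical; this is not how the kernel's SYSCTL macros are defined.

```c
#include <stddef.h>
#include <string.h>

/*
 * The text is stored inline (not behind pointers) so that it physically
 * lives in the named section.  The record is 128 bytes so that records
 * stay contiguous even if the compiler raises their alignment to 16, 32
 * or 64 bytes.
 */
struct sysctl_comment {
	char parent[32];
	char node[32];
	char comment[64];
};

/* Place one record in the "sysctl_comments" section. */
#define SYSCTL_COMMENT(sym, p, n, c)					\
	static const struct sysctl_comment sym				\
	    __attribute__((section("sysctl_comments"), used)) = { p, n, c }

/* Bounds of the section, provided by the linker. */
extern const struct sysctl_comment __start_sysctl_comments[];
extern const struct sysctl_comment __stop_sysctl_comments[];

SYSCTL_COMMENT(cpu0_temp, "dev.cpu.0", "temperature",
    "Current temperature in degC");
SYSCTL_COMMENT(demo_knob, "kern", "demo_knob", "Hypothetical example knob");

/* Number of records the linker collected. */
size_t
sysctl_comment_count(void)
{
	return ((size_t)(__stop_sysctl_comments - __start_sysctl_comments));
}

/* Return the comment for parent.node, or NULL if there is none. */
const char *
sysctl_comment_lookup(const char *parent, const char *node)
{
	const struct sysctl_comment *p;

	for (p = __start_sysctl_comments; p < __stop_sysctl_comments; p++)
		if (strcmp(p->parent, parent) == 0 &&
		    strcmp(p->node, node) == 0)
			return (p->comment);
	return (NULL);
}
```

Because every record sits in one named section, a post-link step can copy that section out to a side file and drop it from the installed binary, which is the "throw-away section" described above. Fixed-size fields waste space and truncate long descriptions; that is a simplification made for the sketch.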

-- 
- Alfred Perlstein


Re: Critical Sections for userland.

2007-10-04 Thread Alfred Perlstein
* Dag-Erling Smørgrav [EMAIL PROTECTED] [071004 02:05] wrote:
 Alfred Perlstein [EMAIL PROTECTED] writes:
  Hi guys, we need critical sections for userland here.
 
  This is basically to avoid a process being switched out while holding
  a user level spinlock.
 
 Yeah, great idea, cooperative multitasking is the new black!

Do you have:

a) Evidence or a paper to prove that this is a bad idea?
b) A helpful suggestion?
c) An obvious understanding of the problem?

If not then perhaps one ought to restrict snarky comments
to private mail where they will be at least somewhat appreciated.

-- 
- Alfred Perlstein


Re: Critical Sections for userland.

2007-10-04 Thread Alfred Perlstein
* Dag-Erling Smørgrav [EMAIL PROTECTED] [071004 03:01] wrote:
 Alfred Perlstein [EMAIL PROTECTED] writes:
  Do you have:
 
  a) Evidence or a paper to prove that this is a bad idea?
 
 I need evidence or a paper to prove that it is a bad idea to allow a
 userland process to hold the CPU indefinitely?
 
  b) A helpful suggestion?
 
 Why don't you tell us what you're actually trying to do, so we can tell
 you how to do it.
 
  c) An obvious understanding of the problem?
 
 I'll show you mine if you show me yours.

It's not worth my time to engage someone with your mind set, you
possess neither the technical nor interpersonal skill to be useful
to me.

For context see my replies in this thread to Kip Macy which explains
how one deals with the false-problems you mention.

For evidence of existing, however suboptimal, run-to-completion
systems see the RTPRIO scheduling knobs.

-- 
- Alfred Perlstein


Re: Critical Sections for userland.

2007-10-04 Thread Alfred Perlstein
* Wilko Bulte [EMAIL PROTECTED] [071004 04:15] wrote:
 Quoting Alfred Perlstein, who wrote on Thu, Oct 04, 2007 at 03:19:02AM -0700 
 ..
  * Dag-Erling Smørgrav [EMAIL PROTECTED] [071004 03:01] wrote:
   Alfred Perlstein [EMAIL PROTECTED] writes:
    Do you have:
   
    a) Evidence or a paper to prove that this is a bad idea?
   
   I need evidence or a paper to prove that it is a bad idea to allow a
   userland process to hold the CPU indefinitely?
   
    b) A helpful suggestion?
   
   Why don't you tell us what you're actually trying to do, so we can tell
   you how to do it.
   
    c) An obvious understanding of the problem?
   
   I'll show you mine if you show me yours.
  
  It's not worth my time to engage someone with your mind set, you
  possess neither the technical nor interpersonal skill to be useful
  to me.
 
 Gentlemen... please?

I think that it would behoove us to explain to developers that
perhaps helping the users instead of talking down to them would
probably catch more flies.

I'm really tired of being told what I need by people that do not
understand what environment I'm coming from.

By not fielding these annoying responses I come across as clueless
or perhaps even satiated by the non-help, conversely by answering
I am forced to play this game with people that I'm quite certain
will not help me even if they are somehow satisfied by the intellectual
or sadistic things they are trying to extract.

Even after making it clear with Kip what is needed, I have two more
developers toying/trolling/etc with me rather than helping me.

Here is how I handle the "holy crap, your idea is hare-brained" case: I
simply say, "well, if you do that, you'll be worse off because of
X, Y or Z and missing out on feature A, B or C, buuut, here's
how you might accomplish that."

So not only do I come across as smart, but also somewhat helpful.

I think we should all know that too much of the former without the
latter does not paint us in a good light.

thank you,
-- 
- Alfred Perlstein


Re: Critical Sections for userland.

2007-10-04 Thread Alfred Perlstein
* Daniel Eischen [EMAIL PROTECTED] [071004 06:05] wrote:
 
 His point about telling us what you're really doing, so we might
 offer other ways to do it is valid.
 
 We don't know why you are using homegrown user-level spinlocks
 instead of pthread mutexes.  Priority ceiling mutexes and running
 in SCHED_RR or SCHED_FIFO is really what tries to address this
 problem, at least from the vague description you give.  If you
 have tried this and they don't work correctly, then one solution
 is to fix them ;-)

First of all we're stuck on 6.x, how are threads on this platform?

Second off we are contending against other devices in the system
that do not run FreeBSD, how do we address that?

-- 
- Alfred Perlstein


Re: Critical Sections for userland.

2007-10-04 Thread Alfred Perlstein
* Dag-Erling Smørgrav [EMAIL PROTECTED] [071004 03:28] wrote:
 Alfred Perlstein [EMAIL PROTECTED] writes:
  It's not worth my time to engage someone with your mind set, you
  possess neither the technical nor interpersonal skill to be useful
  to me.
 
 This could be the beginning of a wonderful friendship...
 
  For context see my replies in this thread to Kip Macy which explains
  how one deals with the false-problems you mention.
 
 I did read them, and I'm not convinced at all.  You are asking for a
 large amount of complexity to be added to the system, but you refuse to
 tell us what you're actually trying to do.  Are you worried that we
 might actually figure out a way to do it without raping the scheduler?

As already explained by Kip, the goal is to avoid switching out a
lock owner due to quantum exhaustion at an inopportune time.

We have needs that may wind up not being applicable to FreeBSD;
however, if we can accomplish this in a way that is not too awful,
we would likely share the code, no matter how annoying you
make it. :)

-- 
- Alfred Perlstein


Critical Sections for userland.

2007-10-02 Thread Alfred Perlstein
Hi guys, we need critical sections for userland here.

This is basically to avoid a process being switched out while holding
a user level spinlock.

The way I envisioned doing this was as follows:

1) A syscall that sets a pointer in the struct thread to a userland flag.
2) The user mlocks the page holding that flag.
3) When the scheduler goes to switch out a process due to quantum exhaustion,
   it checks this pointer; if the flag is set, it gives the process more time
   to run and does not switch it out. (*)
4) The scheduler's load of the flag would seem to have to be non-faulting.

So my questions are:

1) Where would be a good place to add this code in the scheduler, and how?
2) How does one do a read/write to a userland address such that, if the access
would fault, it returns an error rather than trapping?  I'm quite sure the
scheduling decisions are made inside the timer interrupt (am I right?) and
hence would not be allowed to fault in pages.

(*) Note, we will implement limits to this so that a haywire application
is not able to be critical forever.
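The quantum check in step 3, together with the limit in the note above, can be sketched as a small userland simulation. This is only an illustration under stated assumptions: struct crit_state, crit_should_switch() and the cap of 2 extensions are hypothetical names and values, not FreeBSD code, and a real kernel would need a non-faulting read of the flag where this sketch uses a plain pointer dereference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread bookkeeping; none of these names exist in FreeBSD. */
struct crit_state {
	const volatile int *user_flag;	/* registered userland flag, or NULL */
	int extensions_used;		/* extra quanta granted in a row */
};

/* Assumed cap on consecutive extensions, chosen only for illustration. */
enum { CRIT_MAX_EXTENSIONS = 2 };

/*
 * Simulates the decision made when the quantum expires.
 * Returns true if the thread should be switched out now,
 * false if it is granted one more quantum.
 */
static bool
crit_should_switch(struct crit_state *cs)
{
	assert(cs != NULL);

	/* No flag registered, or flag clear: preempt normally, reset the budget. */
	if (cs->user_flag == NULL || *cs->user_flag == 0) {
		cs->extensions_used = 0;
		return true;
	}

	/* Flag set but budget spent: preempt anyway and reset the budget. */
	if (cs->extensions_used >= CRIT_MAX_EXTENSIONS) {
		cs->extensions_used = 0;
		return true;
	}

	/* Flag set and budget left: grant one more quantum. */
	cs->extensions_used++;
	return false;
}
```

With the flag held at 1, the first two calls return false and the third returns true, which is the behavior the limit is meant to guarantee for a haywire application.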

Any help would be appreciated.

-- 
- Alfred Perlstein


Re: Critical Sections for userland.

2007-10-02 Thread Alfred Perlstein
* Daniel Eischen [EMAIL PROTECTED] [071002 19:46] wrote:
 On Tue, 2 Oct 2007, Alfred Perlstein wrote:
 
 Hi guys, we need critical sections for userland here.
 
 This is basically to avoid a process being switched out while holding
 a user level spinlock.
 
 Setting the scheduling class to real-time and using SCHED_FIFO
 and adjusting the thread priority around the lock doesn't work?

Too heavyweight; we basically want to have this sort of code
in userland:

/* assume single threaded process for now */
static int is_critical;



atomic_mutex_lock();  /* implies ++is_critical */
 ...do stuff...
atomic_mutex_unlock(); /* implies --is_critical */

We don't want two or more syscalls per lock operation. :)
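A minimal sketch of what such a lock pair might look like, assuming a C11 compiler and a single-threaded process as in the snippet above. The function names and is_critical follow the message; the bodies and lock_word are assumptions made for illustration, and the step that tells the kernel where is_critical lives is left out because that syscall does not exist.

```c
#include <assert.h>
#include <stdatomic.h>

/* Counter the scheduler would inspect; nonzero means "do not switch me out". */
static volatile int is_critical;

/* The spinlock word itself (hypothetical; would live in shared memory). */
static atomic_flag lock_word = ATOMIC_FLAG_INIT;

static void
atomic_mutex_lock(void)
{
	/* Raise the counter first so the lock is never held unflagged. */
	++is_critical;
	/* Spin until the previous value was clear, i.e. we took the lock. */
	while (atomic_flag_test_and_set_explicit(&lock_word,
	    memory_order_acquire))
		;
}

static void
atomic_mutex_unlock(void)
{
	assert(is_critical > 0);
	/* Release the lock before lowering the counter. */
	atomic_flag_clear_explicit(&lock_word, memory_order_release);
	--is_critical;
}
```

Neither function enters the kernel, which is the point of the design. One consequence of raising the counter before spinning is that a waiter is also treated as critical while it spins; whether that is desirable is a policy choice this sketch does not settle.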

-- 
- Alfred Perlstein

