Re: Kernel thread stack size

2012-07-25 Thread Paul Ambrose
Could you be more specific about "inefficient"?
On 2012-7-25 at 11:22 AM, Warner Losh i...@bsdimp.com wrote:


 On Jul 24, 2012, at 6:40 PM, Paul Ambrose wrote:

  #define PAGE_SHIFT 12
  #define PAGE_SIZE  (1 << PAGE_SHIFT)
 
  #define KSTACK_PAGES 2
  #define KSTACK_GUARD_PAGES 2
 
  I had a MIPS machine (Loongson 3A) with a 16 KB page size (it could be 4 KB,
  but then the OS would have to handle cache aliasing). IMHO, defining
  KSTACK_PAGES as 1 is enough; what is your opinion?

 Well, the PTE has two entries, so having just one page would be
inefficient.

 Warner

  2012/7/23 Konstantin Belousov kostik...@gmail.com
 
  On Mon, Jul 23, 2012 at 02:54:30AM -0400, Richard Yao wrote:
  What is the default kernel thread stack size on FreeBSD? I am
  particularly interested in knowing about i386 and amd64, but knowing
  this for other architectures (such as MIPS) would also be useful.
 
 
  Look for the KSTACK_PAGES symbol defined in sys/arch/include/param.h.
   It defines the _default_ number of pages allocated for the kernel stack of a
   new thread.
 
   We have 4 pages for amd64, and 2 pages for i386, AFAIR. Look up the MIPS
   value yourself.
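
For concreteness, here is a minimal sketch (not FreeBSD source; PAGE_SHIFT 14,
i.e. 16 KB pages, is an assumed value for the Loongson 3A case above) of how
the default kernel stack size falls out of these knobs:

/*
 * Minimal sketch, not FreeBSD source: the kernel stack of a new thread is
 * simply KSTACK_PAGES * PAGE_SIZE, plus unmapped guard pages to catch
 * overflows.  PAGE_SHIFT 14 (16 KB pages) is an assumption for the
 * Loongson 3A case discussed above; i386 uses 12 (4 KB pages).
 */
#include <stdio.h>

#define PAGE_SHIFT          14
#define PAGE_SIZE           (1UL << PAGE_SHIFT)
#define KSTACK_PAGES        2   /* default under discussion */
#define KSTACK_GUARD_PAGES  2

int
main(void)
{
    printf("kernel stack: %lu bytes, plus %lu bytes of guard pages\n",
        KSTACK_PAGES * PAGE_SIZE, KSTACK_GUARD_PAGES * PAGE_SIZE);
    return (0);
}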
 



Re: Awful FreeBSD 9 block IO performance in KVM

2012-07-25 Thread Richard Yao
On 07/22/2012 03:19 AM, Wojciech Puchar wrote:
 You are right. It is not capped at that speed:

 root@freebsd:/root # dd if=/dev/zero of=/dev/da1 bs=16384 count=262144



 262144+0 records in
 262144+0 records out
 4294967296 bytes transferred in 615.840721 secs (6974153 bytes/sec)


 you tested da1, while the dmesg output is about da0?
 
 is that intended, and is da1 another qemu-kvm vdisk?
 
 If so, check
 
 dd if=/dev/zero of=/dev/da1 bs=512 count=256k
 
 and compare speed.
 
 I bet on something near 250 kB/s, and I think it is the long I/O service 
 path length in the qemu-kvm SCSI device simulator.
 
 Just my bet; I don't run FreeBSD in any VM (as opposed to running Windows 
 under FreeBSD in VBox).
 
 check out how much CPU is used on the host side when you do that test.

The test was on da1, but that does not explain how Linux manages to get
over 100MB/s when writing to a file on ext4.

I will do some more comprehensive tests as soon as I find time.





Re: libdwarf

2012-07-25 Thread Bob Bishop
Hi,

On 25 Jul 2012, at 00:45, Rayson Ho wrote:

 Hi,
 
 I need some changes in libdwarf, and I was wondering where I can
 discuss & ask devel-related questions.

You should ask on the dwarf-discuss list, see http://dwarfstd.org/

 The original maintainer was John Birrell, who left us in 2009:
 https://blogs.oracle.com/bmc/entry/john_birrell
 
 Rayson

--
Bob Bishop
r...@gid.co.uk






Re: Generic queue's KPI to manipulate mbuf's queue

2012-07-25 Thread Andre Oppermann

On 24.07.2012 20:18, Arnaud Lacombe wrote:

Hi,

AFAIK, there is no proper KPI for managing mbuf queues. All users have


Before we can talk about an mbuf queue you have to define what you
want to queue.  Is it packets or an mbuf chain which doesn't have
clear delimiters (as with tcp for example)?  Depending on that the
requirements and solutions may be vastly different.


to re-implement the queue logic from scratch, which is less than
optimal. From a preeminent FreeBSD developer at BSDCan 2009: "we do
not need a new list implementation." There have been a few attempts at
providing a queue API, namely dev/cxgb/sys/mbufq.h, but that is
nothing more than an ad-hoc solution to something which _has_to_be_
generic. For the sake of adding more mess to the tree, this
implementation has been duplicated in dev/xen/netfront/mbufq.h...


Duplication is always a sign of the need for a generic approach/KPI.


Now, I understand, or at least merely witness without power, the
reluctance of kernel hackers to have `struct mbuf` evolve, especially
wrt. their desire to keep binary compatibility of the KPI[0]. Now, none of
the current ad-hoc APIs matched my needs, and I really did NOT want to
re-implement yet another list implementation for missing basic operations,
such as deleting an element from the list, so I came up with the attached
patch. The main idea is to be able to use already existing code from
sys/queue.h for mbuf queuing management. It is not the best that
can be done. I am not a huge fan of keeping `m_nextpkt' and
introducing a `m_nextelm'; I would have preferred to use TAILQs, and I
do not like the dwelling in SLIST internal implementation details.
However, this change is relatively lightweight, and it changes neither the ABI
nor the API.
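
A rough sketch of the overlay idea described above (an illustration only, not
the actual attached patch; the trimmed-down struct mbuf and the mbufq helpers
are hypothetical):

#include <sys/queue.h>

struct mbuf {
    union {                                 /* anonymous union: C11/GNU C */
        struct mbuf      *m_nextpkt;        /* existing next-packet link */
        SLIST_ENTRY(mbuf) m_nextelm;        /* same storage, usable by macros */
    };
    /* ... rest of the mbuf header omitted ... */
};

SLIST_HEAD(mbufq, mbuf);                    /* a packet queue head */

/* Generic operations come from sys/queue.h instead of ad-hoc loops. */
static inline void
mbufq_prepend(struct mbufq *q, struct mbuf *m)
{
    SLIST_INSERT_HEAD(q, m, m_nextelm);
}

static inline void
mbufq_remove(struct mbufq *q, struct mbuf *m)
{
    SLIST_REMOVE(q, m, mbuf, m_nextelm);    /* O(n), but no hand-rolled loop */
}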


IMO your change is a rather elegant way of introducing the LIST macros
to the mbuf nextpkt field.  I do like it and don't object to it, provided
you sufficiently answer the question in the first paragraph.

--
Andre


Any comment appreciated.

  - Arnaud

[0]: taking care of having a stable kernel ABI and *not* a stable
userland ABI is beyond my understanding, but this is not the subject
of this mail.








Pre-make and Post-make scripts

2012-07-25 Thread Brandon Falk
I'm curious about the best way to have a script run before and after make.
Would this require modification of the FreeBSD make template, or is this
functionality already built in?

The goal is to have a script that, prior to a build, sets up a filesystem in
RAM (tmpfs) for the build to occur in, and afterwards saves the filesystem
from RAM onto the actual disk for backup/archiving. We all know that our poor
disks could use a little break from the strain of build processes.

-Brandon


Re: Pre-make and Post-make scripts

2012-07-25 Thread Garrett Cooper
On Wed, Jul 25, 2012 at 5:49 AM, Brandon Falk bfalk_...@brandonfa.lk wrote:
 I'm curious about the best way to have a script run before and after make.
 Would this require modification of the FreeBSD make template, or is this
 functionality already built in?

 The goal is to have a script that, prior to a build, sets up a filesystem in
 RAM (tmpfs) for the build to occur in, and afterwards saves the filesystem
 from RAM onto the actual disk for backup/archiving. We all know that our poor
 disks could use a little break from the strain of build processes.

Or just mount MAKEOBJDIRPREFIX (defaults to /usr/obj) on a swap-backed
disk / tmpfs?
Cheers,
-Garrett


Re: Pre-make and Post-make scripts

2012-07-25 Thread Wojciech Puchar


   Or just mount MAKEOBJDIRPREFIX (defaults to /usr/obj) on a swap-backed
disk / tmpfs?

I just do this.




geom - cam disk

2012-07-25 Thread Andriy Gapon


Preamble.  I am trying to understand in detail how things work at the GEOM <-> CAM
disk boundary.  I am looking at scsi_da and ata_da, which seem to be twins in
this respect.

I got an impression that the bioq_disksort calls in the strategy methods and the
related queues are completely useless in the GEOM single-threaded world.
There is only one thread, g_down, that can call a strategy method, the method
enqueues a bio, then calls a schedule function and through xpt_schedule the call
flow continues to a start method which dequeues the bio and off it goes.
I currently cannot see how a bio queue can accumulate more than one bio.

What am I missing? :-)
I will be very glad to learn more about this layer if anyone is willing to
educate me.
Thank you in advance.

P.S. I wrote a very simple DTrace script to test my theory experimentally and my
testing with various workloads didn't disprove the theory so far (which doesn't
mean that it is correct, of course).

The script:
fbt::bioq_disksort:entry
/args[0]->queue.tqh_first == 0/
{
@["empty"] = count();
}

fbt::bioq_disksort:entry
/args[0]->queue.tqh_first != 0/
{
@["non-empty"] = count();
}

It works on all bioq_disksort calls, but I am stressing only ada disks at the 
moment.
-- 
Andriy Gapon



Re: geom - cam disk

2012-07-25 Thread Alexander Motin

On 25.07.2012 23:27, Andriy Gapon wrote:

Preamble.  I am trying to understand in detail how things work at the GEOM <-> CAM
disk boundary.  I am looking at scsi_da and ata_da, which seem to be twins in
this respect.

I got an impression that the bioq_disksort calls in the strategy methods and the
related queues are completely useless in the GEOM single-threaded world.
There is only one thread, g_down, that can call a strategy method, the method
enqueues a bio, then calls a schedule function and through xpt_schedule the call
flow continues to a start method which dequeues the bio and off it goes.
I currently cannot see how a bio queue can accumulate more than one bio.

What am I missing? :-)
I will be very glad to learn more about this layer if anyone is willing to
educate me.
Thank you in advance.

P.S. I wrote a very simple DTrace script to test my theory experimentally and my
testing with various workloads didn't disprove the theory so far (which doesn't
mean that it is correct, of course).

The script:
fbt::bioq_disksort:entry
/args[0]->queue.tqh_first == 0/
{
 @["empty"] = count();
}

fbt::bioq_disksort:entry
/args[0]->queue.tqh_first != 0/
{
 @["non-empty"] = count();
}

It works on all bioq_disksort calls, but I am stressing only ada disks at the 
moment.


Different controllers have different command queueing limitations. If 
you are testing with the ahci(4) driver and modern disks, then their 32 
command slots per port can be enough for many workloads to enqueue all 
commands to the hardware and leave the queue empty as you've described. But 
if you take a harder workload, or a controller/device without command 
queueing support, extra requests will be accumulated on that bioq and 
sorted there.
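
A toy sketch of the elevator-style ordering such a bioq sort aims for (my own
simplification, assuming a single ascending sweep from the current head
position; the real bioq_disksort is more involved):

#include <sys/queue.h>
#include <stdint.h>

struct bio_stub {
    uint64_t              offset;   /* starting block of the request */
    TAILQ_ENTRY(bio_stub) link;
};

TAILQ_HEAD(bioq_stub, bio_stub);

static void
disksort_insert(struct bioq_stub *q, struct bio_stub *bp, uint64_t headpos)
{
    struct bio_stub *it;

    TAILQ_FOREACH(it, q, link) {
        /* Requests at or past the head position are served first, in
         * ascending order; the rest follow after the wrap-around. */
        int it_ahead = it->offset >= headpos;
        int bp_ahead = bp->offset >= headpos;

        if ((bp_ahead && !it_ahead) ||
            (bp_ahead == it_ahead && bp->offset < it->offset)) {
            TAILQ_INSERT_BEFORE(it, bp, link);
            return;
        }
    }
    TAILQ_INSERT_TAIL(q, bp, link);
}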


--
Alexander Motin


Re: geom - cam disk

2012-07-25 Thread Scott Long
Once the bio is put into the bioq from da_strategy, the CAM scheduler is 
called.  It may or may not wind up calling dastart right away; if the simq or 
devq is frozen, or if the devq has been exhausted, then the I/O will be deferred 
until later and the call stack will unwind back into g_down.  The bioq can 
therefore accumulate many bios before being drained.  Draining will usually 
happen from the camisr, at which point you can potentially have I/O being 
initiated from both the camisr and the g_down threads in parallel.  The 
monolithic locking in CAM right now prevents this from actually happening, 
though that's a topic that needs to be revisited.

Scott

On Jul 25, 2012, at 1:27 PM, Andriy Gapon wrote:

 
 
 Preamble.  I am trying to understand in detail how things work at the GEOM <-> CAM
 disk boundary.  I am looking at scsi_da and ata_da, which seem to be twins in
 this respect.
 
 I got an impression that the bioq_disksort calls in the strategy methods and 
 the
 related queues are completely useless in the GEOM single-threaded world.
 There is only one thread, g_down, that can call a strategy method, the method
 enqueues a bio, then calls a schedule function and through xpt_schedule the 
 call
 flow continues to a start method which dequeues the bio and off it goes.
 I currently cannot see how a bio queue can accumulate more than one bio.
 
 What am I missing? :-)
 I will be very glad to learn more about this layer if anyone is willing to
 educate me.
 Thank you in advance.
 
 P.S. I wrote a very simple DTrace script to test my theory experimentally and 
 my
 testing with various workloads didn't disprove the theory so far (which 
 doesn't
 mean that it is correct, of course).
 
 The script:
 fbt::bioq_disksort:entry
 /args[0]->queue.tqh_first == 0/
 {
@["empty"] = count();
 }
 
 fbt::bioq_disksort:entry
 /args[0]->queue.tqh_first != 0/
 {
@["non-empty"] = count();
 }
 
 It works on all bioq_disksort calls, but I am stressing only ada disks at the 
 moment.
 -- 
 Andriy Gapon
 



Re: geom - cam disk

2012-07-25 Thread Andriy Gapon
on 26/07/2012 00:14 Scott Long said the following:
 Once the bio is put into the bioq from da_strategy, the CAM scheduler is
 called.  It may or may not wind up calling dastart right away; if the simq or
 devq is frozen, or if the devq has been exhausted, then the I/O will be
 deferred until later and the call stack will unwind back into g_down.  The
 bioq can therefore accumulate many bios before being drained.  Draining will
 usually happen from the camisr, at which point you can potentially have I/O
 being initiated from both the camisr and the g_down threads in parallel.  The

Uh-hah.  Thank you for the answer.  I didn't think of the case of
frozen/exhausted queues and also didn't hit it in my tests.
Now I am starting to understand the logic in xpt_run_dev_allocq.

BTW, I think that it would be nice if the GEOM work-processing could re-use the
CAM model.
That is, try to execute GEOM bio transformations in the original thread as much
as possible, and defer work to the GEOM thread only as a last resort.

 monolithic locking in CAM right now prevents this from actually happening,
 though that's a topic that needs to be revisited.


 On Jul 25, 2012, at 1:27 PM, Andriy Gapon wrote:
 
 
 
 Preamble.  I am trying to understand in detail how things work at the GEOM <->
 CAM disk boundary.  I am looking at scsi_da and ata_da, which seem to be
 twins in this respect.
 
 I got an impression that the bioq_disksort calls in the strategy methods
 and the related queues are completely useless in the GEOM single-threaded
 world. There is only one thread, g_down, that can call a strategy method,
 the method enqueues a bio, then calls a schedule function and through
 xpt_schedule the call flow continues to a start method which dequeues the
 bio and off it goes. I currently cannot see how a bio queue can accumulate
 more than one bio.
 
 What am I missing? :-) I will be very glad to learn more about this layer
 if anyone is willing to educate me. Thank you in advance.
 
 P.S. I wrote a very simple DTrace script to test my theory experimentally
 and my testing with various workloads didn't disprove the theory so far
 (which doesn't mean that it is correct, of course).
 
 The script: fbt::bioq_disksort:entry /args[0]->queue.tqh_first == 0/ { 
 @["empty"] = count(); }
 
 fbt::bioq_disksort:entry /args[0]->queue.tqh_first != 0/ { @["non-empty"] =
 count(); }
 
 It works on all bioq_disksort calls, but I stressing only ada disks at the
 moment. -- Andriy Gapon
 
 


-- 
Andriy Gapon




Re: geom - cam disk

2012-07-25 Thread Andriy Gapon
on 26/07/2012 01:08 Alexander Motin said the following:
 Different controllers have different command queueing limitations. If you are
 testing with the ahci(4) driver and modern disks, then their 32 command slots per
 port can be enough for many workloads to enqueue all commands to the hardware
 and leave the queue empty as you've described. But if you take a harder workload, or
 a controller/device without command queueing support, extra requests will be
 accumulated on that bioq and sorted there.

Alexander,

thank you for the reply.
Indeed, using 64 parallel dd processes with bs=512 I was able to 'kick in' the
disksort logic.  But I am not sure if the disksort algorithm makes much
difference in this case given the number of commands that a disk firmware can
internally re-order.  (Not to mention that potentially disksort could starve
some I/O bound processes in favor of others -- but that's a totally different
topic).

But then, of course, for the less capable hardware the disksort could still be a
significant factor.

-- 
Andriy Gapon




Re: geom - cam disk

2012-07-25 Thread Warner Losh

On Jul 25, 2012, at 4:29 PM, Andriy Gapon wrote:
 BTW, I think that it would be nice if the GEOM work-processing could re-use 
 the
 CAM model.
 That is, try to execute GEOM bio transformations in the original thread as 
 much
 as possible, defer work to the GEOM thread as the last resort.

Lots of people would like to see this.  Especially people who want high IOPS.

Warner



Re: geom - cam disk

2012-07-25 Thread Steven Hartland
- Original Message - 
From: Andriy Gapon a...@freebsd.org



on 26/07/2012 01:08 Alexander Motin said the following:

Different controllers have different command queueing limitations. If you are
testing with the ahci(4) driver and modern disks, then their 32 command slots per
port can be enough for many workloads to enqueue all commands to the hardware
and leave the queue empty as you've described. But if you take a harder workload, or
a controller/device without command queueing support, extra requests will be
accumulated on that bioq and sorted there.


Alexander,

thank you for the reply.
Indeed, using 64 parallel dd processes with bs=512 I was able to 'kick in' the
disksort logic.  But I am not sure if the disksort algorithm makes much
difference in this case given the number of commands that a disk firmware can
internally re-order.  (Not to mention that potentially disksort could starve
some I/O bound processes in favor of others -- but that's a totally different
topic).

But then, of course, for the less capable hardware the disksort could still be a
significant factor.


The sort is actually important for delete requests too, as it can allow
the delete-processing code to operate more effectively, which can result in
significant performance increases if it then allows request combining.

For example, Alexander is currently reviewing some changes I've written to the
delete processing which include an optimisation that increases SSD delete
performance from ~630 MB/s to 1.3 GB/s on 3rd-gen SandForce controllers.
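
A minimal sketch of the request-combining idea (my assumption about the
approach, not the patch under review): once the queue is sorted by offset,
adjacent or overlapping delete ranges can be folded into a single larger
request before it is issued to the device.

#include <stdint.h>

struct trim_req {
    uint64_t start;     /* first block */
    uint64_t length;    /* number of blocks */
};

/*
 * Fold 'next' into 'cur' if they touch or overlap; returns 1 on success.
 * Assumes the requests are already sorted, i.e. cur->start <= next->start.
 */
static int
trim_try_combine(struct trim_req *cur, const struct trim_req *next)
{
    uint64_t cur_end = cur->start + cur->length;

    if (next->start <= cur_end) {
        uint64_t next_end = next->start + next->length;

        if (next_end > cur_end)
            cur->length = next_end - cur->start;
        return (1);
    }
    return (0);
}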

   Regards
   Steve





Re: Generic queue's KPI to manipulate mbuf's queue

2012-07-25 Thread Arnaud Lacombe
Hi,

On Wed, Jul 25, 2012 at 7:25 AM, Andre Oppermann an...@freebsd.org wrote:
 On 24.07.2012 20:18, Arnaud Lacombe wrote:

 Hi,

 AFAIK, there is no proper KPI for managing mbuf queues. All users have
 Before we can talk about an mbuf queue you have to define what you
 want to queue.  Is it packets or an mbuf chain which doesn't have
 clear delimiters (as with tcp for example)?  Depending on that the
 requirements and solutions may be vastly different.

I was thinking about queues in the general use case of m_nextpkt:
that would be dummynet queuing, QoS, various reassembly queues, socket
buffers, etc...

 to re-implement the queue logic from scratch, which is less than
 optimal. From a preeminent FreeBSD developer at BSDCan 2009: "we do
 not need a new list implementation." There have been a few attempts at
 providing a queue API, namely dev/cxgb/sys/mbufq.h, but that is
 nothing more than an ad-hoc solution to something which _has_to_be_
 generic. For the sake of adding more mess to the tree, this
 implementation has been duplicated in dev/xen/netfront/mbufq.h...

 Duplication is always a sign for the need of a generic approach/KPI.


 Now, I understand, or at least merely witness without power, the
 reluctance of kernel hackers to have `struct mbuf` evolve, especially
 wrt. their desire to keep binary compatibility of the KPI[0]. Now, none of
 the current ad-hoc APIs matched my needs, and I really did NOT want to
 re-implement yet another list implementation for missing basic operations,
 such as deleting an element from the list, so I came up with the attached
 patch. The main idea is to be able to use already existing code from
 sys/queue.h for mbuf queuing management. It is not the best that
 can be done. I am not a huge fan of keeping `m_nextpkt' and
 introducing a `m_nextelm'; I would have preferred to use TAILQs, and I
 do not like the dwelling in SLIST internal implementation details.
 However, this change is relatively lightweight, and it changes neither the ABI
 nor the API.

 IMO your change is a rather elegant way of introducing the LIST macros
 to the mbuf nextpkt field.  I do like it and don't object to it, provided
 you sufficiently answer the question in the first paragraph.

Actually, I made a mistake selecting SLISTs; it should really be an
STAILQ. It has the same advantage wrt. the ABI, and most usage made of
`m_nextpkt' follows tail-queue logic. The only advantage of TAILQ
would be reverse traversal and constant-time removal of inner
elements.
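
A sketch of that STAILQ variant (again an assumption, not a real patch; the
field and helper names are made up): the entry is still a single next pointer,
so it can share storage with m_nextpkt, while the head gains a tail pointer
for constant-time enqueue at the end, the usual packet-queue pattern.

#include <sys/queue.h>

struct mbuf {
    union {                                    /* anonymous union: C11/GNU C */
        struct mbuf       *m_nextpkt;
        STAILQ_ENTRY(mbuf) m_stailqpkt;        /* hypothetical field name */
    };
    /* ... rest of the mbuf header omitted ... */
};

STAILQ_HEAD(mbufq, mbuf);    /* head carries both first and tail pointers */

static inline void
mbufq_enqueue(struct mbufq *q, struct mbuf *m)
{
    STAILQ_INSERT_TAIL(q, m, m_stailqpkt);     /* constant-time tail insert */
}

static inline struct mbuf *
mbufq_dequeue(struct mbufq *q)
{
    struct mbuf *m = STAILQ_FIRST(q);

    if (m != NULL)
        STAILQ_REMOVE_HEAD(q, m_stailqpkt);
    return (m);
}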

 - Arnaud

 --
 Andre

 Any comment appreciated.

   - Arnaud

 [0]: taking care of having a stable kernel ABI and *not* a stable
 userland ABI is beyond my understanding, but this is not the subject
 of this mail.




