Re: How git performs when you throw all of Debian at it

2013-09-02 Thread Philipp Kern
On Mon, Sep 02, 2013 at 11:27:33AM +0200, Tollef Fog Heen wrote:
> ]] Luca Filipozzi 
> > On Sat, Aug 31, 2013 at 12:40:08AM +0200, Michael Stapelberg wrote:
> > > Luca Filipozzi  writes:
> > > > Why do you say that when you haven't even asked?
> > > Because I thought the answer was going to be “not in the Linux kernel,
> > > no chance”.
> > We also run kfreebsd, with some challenge, but we have it.
> It's not really an option for codesearch since codesearch needs systemd,
> though.

Also I'm not sure how ZFS would solve the basic constraints of hardware
and be a magic bullet. The index can already be compressed in the application
layer. Deduplication adds the need to use a lot of RAM for the dedup tables.
The L2ARC of ZFS is commonly put on SSDs, which we don't have. So it'd basically
yield the need to add a lot of RAM to the VM for a reasonably sized ZFS in-memory
cache and the dedup tables, for probably not much gain in avoided seeks (if git
delta dedup doesn't even find that much, why would block-level dedup work
better?). It's not as if nobody proposed throwing the RAM at more page cache
instead (which could yield similar savings); that was equally considered a bad
choice.
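
To put rough numbers on the RAM concern, here is a back-of-the-envelope
sketch. It assumes the frequently quoted rule of thumb of about 320 bytes of
in-memory dedup-table entry per unique block; that figure, the block sizes and
the 1 TiB pool are illustrative assumptions, not measurements from the
codesearch VM.

```python
def dedup_table_bytes(pool_bytes, avg_block_bytes=128 * 1024, entry_bytes=320):
    """Estimate the in-memory size of a block-level dedup table.

    entry_bytes=320 is a commonly cited rule of thumb, not a measured value;
    avg_block_bytes is a guess (many small source files mean smaller blocks).
    """
    unique_blocks = pool_bytes // avg_block_bytes
    return unique_blocks * entry_bytes


GIB = 1024 ** 3

for block_kib in (128, 8):
    need = dedup_table_bytes(1024 * GIB, avg_block_bytes=block_kib * 1024)
    print(f"1 TiB pool, {block_kib} KiB blocks: ~{need / GIB:.1f} GiB of dedup table")
```

Under these assumptions the table grows in inverse proportion to the average
block size, so a tree made of many small files is the expensive case.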

Kind regards
Philipp Kern




Re: How git performs when you throw all of Debian at it

2013-09-02 Thread Philipp Kern
On Sun, Sep 01, 2013 at 09:07:47PM +0100, Stephen Gran wrote:
> luca is not asking, "why aren't you using the new shiny", he's asking,
> "why hasn't a proposal for a project that does string searching used one
> of the already available, off the shelf string searching programs".
> 
> Asking someone why they didn't choose off the shelf tools to do a job
> that is largely a solved problem is a reasonable thing to do, especially
> when being asked to support new, custom code that is certain to come
> with its own quirks, bugs, and security issues.
> 
> I think the answer here is, "more code could be reused by borrowing work
> from google" - the NIH syndrome appears to be google's, rather than
> Michael's, if I understand correctly.  That's probably a fine answer,
> but the question also needed to be asked.

Fair point. My personal guess is that Solr's support for regex search
was not up to par at that point. codesearch is not about simple string
searching, but about regex as well, which was long something that wasn't
very performant because it couldn't be indexed efficiently.

The few bits I read up about Solr + regex seem to imply that your
indices somewhat explode in size (because it has to look at more context
than for simple strings), but that may or may not be true anymore.

Kind regards
Philipp Kern


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130902125357.ga10...@hub.kern.lc



Re: DSA, kFreeBSD, and the Singularity (was: How git performs when you throw all of Debian at it)

2013-09-02 Thread Peter Palfrader
On Sun, 01 Sep 2013, Steven Chamberlain wrote:

> I can only recall one wishlist bug from DSA at the moment which is
> #711247 requesting pflogd.  I'd love to hear more wishlist kfreebsd
> ideas from DSA.

syslog-ng on kfreebsd doesn't properly reconnect to logservers after
they went away for a while or the network dropped.

Would be nice if that could be fixed.
-- 
   |  .''`.   ** Debian **
  Peter Palfrader  | : :' :  The  universal
 http://www.palfrader.org/ | `. `'  Operating System
   |   `-http://www.debian.org/


-- 
Archive: http://lists.debian.org/20130902121637.gc14...@anguilla.noreply.org



Re: How git performs when you throw all of Debian at it

2013-09-02 Thread Ian Jackson
Tollef Fog Heen writes ("Re: How git performs when you throw all of Debian at it"):
> ]] Luca Filipozzi 
> > We also run kfreebsd, with some challenge, but we have it.
> 
> It's not really an option for codesearch since codesearch needs systemd,
> though.

Why is that, JOOI ?

Ian.


-- 
Archive: 
http://lists.debian.org/21028.26486.458404.471...@chiark.greenend.org.uk



Re: How git performs when you throw all of Debian at it

2013-09-02 Thread Tollef Fog Heen
]] Luca Filipozzi 

> On Sat, Aug 31, 2013 at 12:40:08AM +0200, Michael Stapelberg wrote:
> > Luca Filipozzi  writes:
> > > Why do you say that when you haven't even asked?
> > Because I thought the answer was going to be “not in the Linux kernel,
> > no chance”.
> 
> We also run kfreebsd, with some challenge, but we have it.

It's not really an option for codesearch since codesearch needs systemd,
though.

-- 
Tollef Fog Heen
UNIX is user friendly, it's just picky about who its friends are


-- 
Archive: http://lists.debian.org/877gezefp6@qurzaw.varnish-software.com



Re: DSA, kFreeBSD, and the Singularity (was: How git performs when you throw all of Debian at it)

2013-09-02 Thread Luca Filipozzi
On Sun, Sep 01, 2013 at 09:49:30PM +0100, Steven Chamberlain wrote:
> Hi Luca!
> 
> On Sat, 31 Aug 2013 16:12:11 +, Luca Filipozzi wrote:
> > We also run kfreebsd, with some challenge, but we have it.
> 
> How can we help with that?

http://dsa.debian.org/ports/kfreebsd/

> e.g. Do you need it to better suit running as a virtualised guest, on
> KVM or Xen for example?  virtio drivers should be forthcoming for jessie
> and a XENHVM flavour is possible if it seems worth it.

We use ganeti (kvm) for virtualization.  Yes, better guest support would be
great.

More equivalency between linux and kfreebsd userland, at least as far as puppet
is concerned, would help.  For example, we like ferm, but ferm is iptables
only.

> I can only recall one wishlist bug from DSA at the moment which is
> #711247 requesting pflogd.  I'd love to hear more wishlist kfreebsd
> ideas from DSA.

See above url with incomplete list.

-- 
Luca Filipozzi
http://www.crowdrise.com/SupportDebian


-- 
Archive: http://lists.debian.org/20130902071431.ga1...@emyr.net



Re: DSA, kFreeBSD, and the Singularity (was: How git performs when you throw all of Debian at it)

2013-09-01 Thread Steven Chamberlain
Hi Luca!

On Sat, 31 Aug 2013 16:12:11 +, Luca Filipozzi wrote:
> We also run kfreebsd, with some challenge, but we have it.

How can we help with that?

e.g. Do you need it to better suit running as a virtualised guest, on
KVM or Xen for example?  virtio drivers should be forthcoming for jessie
and a XENHVM flavour is possible if it seems worth it.

I can only recall one wishlist bug from DSA at the moment which is
#711247 requesting pflogd.  I'd love to hear more wishlist kfreebsd
ideas from DSA.

I'd like it to be readily available whenever it might be a good fit for
some internal project.  Having as much Debian software as possible
feeding back into Debian development systems makes the project seem
something of an unstoppable machine approaching the Singularity...

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org


-- 
Archive: http://lists.debian.org/5223a85...@pyro.eu.org



Re: How git performs when you throw all of Debian at it

2013-09-01 Thread Stephen Gran
This one time, at band camp, Philipp Kern said:
> On Sat, Aug 31, 2013 at 04:12:11PM +, Luca Filipozzi wrote:
> > I'm curious why there's no apparent appetite for hdfs / solr / etc.
> 
> I don't know how far regex matching is with solr these days. This
> implementation is AFAIK based on [1]. But then the tool exists and
> would need to be thrown away completely for no obvious gain. How would
> you store the solr index? How would lookups be faster than a custom
> built index where it's basically known how many seeks you need per
> request?
> 
> It's akin to asking you to port everything to Chef just for the sake of
> it, except that the target language would actually be Java, which
> likely consumes even more resources than the Go binary.

That's looking at it upside down, I think.  chef was the new shiny after
we'd already started down the puppet road.  Michael's implementation did
not, in fact, exist before solr.

luca is not asking, "why aren't you using the new shiny", he's asking,
"why hasn't a proposal for a project that does string searching used one
of the already available, off the shelf string searching programs".

Asking someone why they didn't choose off the shelf tools to do a job
that is largely a solved problem is a reasonable thing to do, especially
when being asked to support new, custom code that is certain to come
with its own quirks, bugs, and security issues.

I think the answer here is, "more code could be reused by borrowing work
from google" - the NIH syndrome appears to be google's, rather than
Michael's, if I understand correctly.  That's probably a fine answer,
but the question also needed to be asked.

Cheers,
-- 
 -
|   ,''`.Stephen Gran |
|  : :' :sg...@debian.org |
|  `. `'Debian user, admin, and developer |
|`- http://www.debian.org |
 -




Re: How git performs when you throw all of Debian at it

2013-08-31 Thread Ben Hutchings
On Sat, 2013-08-31 at 22:22 +0200, Adam Borowski wrote:
> On Sat, Aug 31, 2013 at 12:44:21AM +0100, Ben Hutchings wrote:
> > > Funny that you ask.  What's the usual competitor for ZFS?
> > > btrfs is included in stock kernels, doesn't take massive amounts of memory,
> > > and has a different approach to deduplication.
> > 
> > and is slower than any of its competitors.
> 
> time (tar xfa /usr/src/linux-source-3.11-rc4.tar.xz;sync)
> ext4:  32.002s
> btrfs: 18.770s
> (an old spinning disk, just had a failure of my primary one)
> 
> As Ceph backend:
> http://ceph.com/uncategorized/argonaut-vs-bobtail-performance-preview/

On the other hand:
http://article.gmane.org/gmane.comp.file-systems.xfs.general/54140
http://www.phoronix.com/scan.php?page=article&item=linux_310_10fs&num=2

> On the other hand, unless you deal with the fsync madness somehow, dpkg
> performance is, before kernel 3.5, worse than abysmal, and on 3.5+, merely
> bad.  Yet even on ext4, getting rid of fsync gives a massive speedup, so
> you want to wrap apt/dpkg with eatmydata where possible[1].  With no fsyncs,
> btrfs slightly wins.
> 
> 
> Thus: it depends on your particular usage pattern.  With filesystems of
> so different design principles it's hard to tell which is faster: there
> are cases when btrfs beats competition by a lot, there are cases where
> it gets beaten.
> 
> 
> > > Recent kernels are needed only for race-free deduplication, "cp --reflink"
> > > works in oldstable.
> > 
> > Please don't suggest using the squeeze or wheezy version of btrfs in
> > production.
> 
> 2.6.32 (squeeze) has serious problems in ENOSPC conditions (up to a panic,
> but no data loss) and works otherwise, what's the problem with 3.2?

~/src/d-k/dists/wheezy/linux$ grep -r 'BUG_ON(ret)' fs/btrfs | wc -l
290

So there are still 290 instances where an error will crash the system.
Some will be cases where the caller has established a precondition that
means the called function can't fail.  Surely not all of them, though.
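
For anyone who wants the same count broken down per file, a minimal sketch
follows. The function name is made up for illustration, and it only
approximates the grep pipeline above (grep reports a binary file as a single
"matches" line, while this decodes everything as text).

```python
import os


def count_matching_lines(root, needle="BUG_ON(ret)"):
    """Count lines containing `needle` under `root`, grouped by file."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="replace") as fh:
                    n = sum(1 for line in fh if needle in line)
            except OSError:
                continue  # unreadable file: skip it
            if n:
                counts[path] = n
    return counts
```

Calling count_matching_lines("fs/btrfs") from the top of a kernel tree and
summing the values should give roughly the figure that grep | wc -l reports;
sorting the items by count shows which files carry most of the BUG_ON calls.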

(And in 3.11, there are still 100 of these...)

Many of the many bug fixes I see cc'd to stable appear to be fixing bugs
that exist in 3.2, but due to intervening changes they can't be applied
without backporting.  No-one's providing the backported versions, so the
bugs don't get fixed.

> Sure, fsync speed-ups from 3.5 and send/receive from 3.6 are good to have,
> but by no means necessary.
> 
> Distributions whose commercial support specifically mentions btrfs (Oracle,
> SUSE) are using 3.0 and 3.1.
[...]

Bet they have lots of backported fixes, though.

Ben.

-- 
Ben Hutchings
The most exhausting thing in life is being insincere. - Anne Morrow Lindbergh




Re: How git performs when you throw all of Debian at it

2013-08-31 Thread Adam Borowski
On Sat, Aug 31, 2013 at 12:44:21AM +0100, Ben Hutchings wrote:
> > Funny that you ask.  What's the usual competitor for ZFS?
> > btrfs is included in stock kernels, doesn't take massive amounts of memory,
> > and has a different approach to deduplication.
> 
> and is slower than any of its competitors.

time (tar xfa /usr/src/linux-source-3.11-rc4.tar.xz;sync)
ext4:  32.002s
btrfs: 18.770s
(an old spinning disk, just had a failure of my primary one)

As Ceph backend:
http://ceph.com/uncategorized/argonaut-vs-bobtail-performance-preview/

On the other hand, unless you deal with the fsync madness somehow, dpkg
performance is, before kernel 3.5, worse than abysmal, and on 3.5+, merely
bad.  Yet even on ext4, getting rid of fsync gives a massive speedup, so
you want to wrap apt/dpkg with eatmydata where possible[1].  With no fsyncs,
btrfs slightly wins.
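
The cost of per-file fsync is easy to measure on any given filesystem. The
sketch below writes a batch of small files with and without an fsync after
each one, which only loosely mimics a package unpack; the file count and size
are arbitrary assumptions. eatmydata gets a similar effect for an unmodified
program by preloading a library that turns fsync and friends into no-ops.

```python
import os
import tempfile
import time


def write_files(directory, count=100, size=4096, use_fsync=True):
    """Write `count` small files, optionally fsyncing each one.

    Returns the elapsed wall-clock time in seconds.
    """
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"file{i:04d}")
        with open(path, "wb") as fh:
            fh.write(payload)
            if use_fsync:
                fh.flush()             # push Python's buffer to the kernel
                os.fsync(fh.fileno())  # force the kernel to write it out
    return time.perf_counter() - start


if __name__ == "__main__":
    for flag in (True, False):
        with tempfile.TemporaryDirectory() as tmp:
            secs = write_files(tmp, use_fsync=flag)
            print(f"fsync={flag}: {secs:.3f}s")
```

Point tempfile at a directory on the filesystem under test (for example via
TMPDIR); on tmpfs the two runs will look nearly identical.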


Thus: it depends on your particular usage pattern.  With filesystems of
so different design principles it's hard to tell which is faster: there
are cases when btrfs beats competition by a lot, there are cases where
it gets beaten.


> > Recent kernels are needed only for race-free deduplication, "cp --reflink"
> > works in oldstable.
> 
> Please don't suggest using the squeeze or wheezy version of btrfs in
> production.

2.6.32 (squeeze) has serious problems in ENOSPC conditions (up to a panic,
but no data loss) and works otherwise, what's the problem with 3.2?
Sure, fsync speed-ups from 3.5 and send/receive from 3.6 are good to have,
but by no means necessary.

Distributions whose commercial support specifically mentions btrfs (Oracle,
SUSE) are using 3.0 and 3.1.

I second the recommendation to use newer kernels, but I see no reason to
discourage using it with wheezy.

[Disclaimer: while I use btrfs on a bunch of machines, I never touched its
internal RAID, so I can vouch only for single block device use (including
hw and md RAID).]


[1]. On ext4, this should be at least debootstrap (no data to lose yet) and
pbuilder/piuparts (a throw-away chroot).  On btrfs using eatmydata on apt is
safe even on the host system, albeit not in the default configuration: you'd
need snapshots you can roll back to.

-- 
ᛊᚨᚾᛁᛏᚣ᛫ᛁᛊ᛫ᚠᛟᚱ᛫ᚦᛖ᛫ᚹᛖᚨᚲ


-- 
Archive: http://lists.debian.org/20130831202231.ga26...@angband.pl



Re: How git performs when you throw all of Debian at it

2013-08-31 Thread Philipp Kern
On Sat, Aug 31, 2013 at 04:12:11PM +, Luca Filipozzi wrote:
> I'm curious why there's no apparent appetite for hdfs / solr / etc.

I don't know how far regex matching is with solr these days. This
implementation is AFAIK based on [1]. But then the tool exists and would need
to be thrown away completely for no obvious gain. How would you store the solr
index? How would lookups be faster than a custom built index where it's
basically known how many seeks you need per request?
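
For readers unfamiliar with the approach in [1]: the idea is to index
trigrams (3-character substrings), use the index to narrow down candidate
files, and run the actual regex only on those. A toy sketch follows; the
class and method names are made up, and it takes one big shortcut in that the
caller passes a literal that every match must contain, whereas the technique
in [1] derives the trigram query from the regex itself.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of `text`."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> ids of docs containing it
        self.docs = {}

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for tri in trigrams(text):
            self.postings[tri].add(doc_id)

    def candidates(self, literal):
        """Docs containing every trigram of `literal` (a superset of real hits)."""
        tris = trigrams(literal)
        if not tris:
            return set(self.docs)  # literal too short to filter on
        sets = [self.postings.get(t, set()) for t in tris]
        return set.intersection(*sets)

    def search(self, pattern, literal):
        """Run `pattern` only on docs that could contain `literal`."""
        rx = re.compile(pattern)
        return sorted(d for d in self.candidates(literal) if rx.search(self.docs[d]))
```

The number of posting lists read per query is bounded by the number of
trigrams in the query, which is what makes the seek count predictable.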

It's akin to asking you to port everything to Chef just for the sake of it,
except that the target language would actually be Java, which likely
consumes even more resources than the Go binary.

Michael was never opposed to sharding. HDFS is apparently a high-throughput FS,
so not particularly suitable for random access if you try to avoid SSDs at all
cost anyway.

Kind regards
Philipp Kern

[1] http://swtch.com/~rsc/regexp/regexp4.html
 




Re: How git performs when you throw all of Debian at it

2013-08-31 Thread Luca Filipozzi
On Sat, Aug 31, 2013 at 12:40:08AM +0200, Michael Stapelberg wrote:
> Luca Filipozzi  writes:
> > Why do you say that when you haven't even asked?
> Because I thought the answer was going to be “not in the Linux kernel,
> no chance”.

We also run kfreebsd, with some challenge, but we have it.

> > To address this specific thread, the challenge with ZFS is not that we don't
> > like the idea (I'm keen on it, actually) but that it's not in the
> > Linux kernel.
> See.
> 
> I don’t have any energy to spend on this right now. Don’t expect
> answers, I am killing this thread in my mailclient.

Okay.

I'm curious why there's no apparent appetite for hdfs / solr / etc.

-- 
Luca Filipozzi
http://www.crowdrise.com/SupportDebian


-- 
Archive: http://lists.debian.org/20130831161211.ga30...@emyr.net



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Adam Borowski
On Sat, Aug 31, 2013 at 12:32:47AM +0100, Dmitrijs Ledkovs wrote:
> On 30 August 2013 20:55, Steven Chamberlain  wrote:
> > Hi,
> >
> >> [...] using git instead of the file system for storing the contents
> >> of Debian Code Search. The hope was that it would lead to fewer disk
> >> seeks and less data due to git's delta-encoding
> >
> > Wouldn't ZFS be a more natural way to do something like this?
> >
> > A choice of gzip, lzjb and more recently lz4 compression;  snapshots
> > and/or deduplication both reduce the amount of disk blocks and cache
> > memory needed.
> >
> > I've pondered before at this overlap in functionality between packing by
> > Git, and those features of the ZFS filesystem.  They are doing much the
> > same thing but with different granularity.  It would be neat if they
> > could work together better.
> 
> I haven't finished packaging bedup - btrfs deduplication tool.

bedup is only a userspace tool that calculates per-file hashes, then uses
chattr tricks to avoid a race condition if some other process tried to write
to the file.  git renders the first part not needed: hashes are already
known.  If you're the only writer, you don't need to care about write races
either.
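
To illustrate "hashes are already known": git names every file's content by
the SHA-1 of a short header plus the bytes, so two files with identical
content share one blob id, and grouping by that id yields the dedup
candidates without rehashing anything. A minimal sketch, with made-up
function names and an in-memory dict standing in for the unpacked archive:

```python
import hashlib
from collections import defaultdict


def git_blob_id(data: bytes) -> str:
    """SHA-1 that git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def duplicate_groups(files):
    """Group names by blob id; `files` maps name -> bytes.

    Only groups with more than one member are dedup candidates.
    """
    by_id = defaultdict(list)
    for name, data in files.items():
        by_id[git_blob_id(data)].append(name)
    return {h: sorted(names) for h, names in by_id.items() if len(names) > 1}
```

In a real repository the ids would come from the tree objects (for example
via git ls-tree -r) rather than being recomputed from file contents.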

> Has anybody benchmarked that, to see if it's any good and/or comparable to zfs
> deduplication?

It's an apples to microsofts comparison: zfs takes a massive amount of
memory to store block hashes.  This has an upside: duplicated data never
hits the actual disk, and a downside: the memory cannot be used for
anything else, and if hashes hit the disk things become really slow.
With btrfs, unless you know the hash beforehand (git), deduplication
works after a write.  This might be months later (one-shot), during the
night (cron) or, if *notify is used, a small fraction of a second later.
btrfs can enumerate recently changed blocks for you so there's no need
to read the whole disk in that cron job.

-- 
ᛊᚨᚾᛁᛏᚣ᛫ᛁᛊ᛫ᚠᛟᚱ᛫ᚦᛖ᛫ᚹᛖᚨᚲ


-- 
Archive: http://lists.debian.org/20130831001654.ga16...@angband.pl



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Ben Hutchings
On Sat, 2013-08-31 at 01:21 +0200, Adam Borowski wrote:
> On Fri, Aug 30, 2013 at 10:14:25PM +, Luca Filipozzi wrote:
> > On Fri, Aug 30, 2013 at 10:49:48PM +0200, Michael Stapelberg wrote:
> > > Steven Chamberlain  writes:
> > > > Wouldn't ZFS be a more natural way to do something like this?
> > > Possibly, but I have zero hopes of getting it set up and supported by
> > > DSA, so we can’t use it for this service.
> > 
> > > To address this specific thread, the challenge with ZFS is not that we don't
> > > like the idea (I'm keen on it, actually) but that it's not in the Linux kernel.
> > > We prefer to use stock Debian kernels than custom-built kernels or modules for
> > > our machines.
> > 
> > Is there another filesystem (or another approach) that would improve
> > performance?
> 
> Funny that you ask.  What's the usual competitor for ZFS?
> btrfs is included in stock kernels, doesn't take massive amounts of memory,
> and has a different approach to deduplication.

and is slower than any of its competitors.  (But at least it doesn't
have an incompatible licence.)

> Recent kernels are needed only for race-free deduplication, "cp --reflink"
> works in oldstable.

Please don't suggest using the squeeze or wheezy version of btrfs in
production.

Ben.

-- 
Ben Hutchings
If God had intended Man to program,
we'd have been born with serial I/O ports.




Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Dmitrijs Ledkovs
On 30 August 2013 20:55, Steven Chamberlain  wrote:
> Hi,
>
>> [...] using git instead of the file system for storing the contents
>> of Debian Code Search. The hope was that it would lead to fewer disk
>> seeks and less data due to git's delta-encoding
>
> Wouldn't ZFS be a more natural way to do something like this?
>
> A choice of gzip, lzjb and more recently lz4 compression;  snapshots
> and/or deduplication both reduce the amount of disk blocks and cache
> memory needed.
>
> I've pondered before at this overlap in functionality between packing by
> Git, and those features of the ZFS filesystem.  They are doing much the
> same thing but with different granularity.  It would be neat if they
> could work together better.

I haven't finished packaging bedup, the btrfs deduplication tool. Has anybody
benchmarked it, to see if it's any good and/or comparable to zfs
deduplication? lzo compression is also available. And, well, btrfs is
available in the Linux kernel.

Regards,

Dmitrijs.


-- 
Archive: 
http://lists.debian.org/canbhluibdvtshc5dbtwobf67xdexnjpumafcpvx1auj37te...@mail.gmail.com



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Adam Borowski
On Fri, Aug 30, 2013 at 10:14:25PM +, Luca Filipozzi wrote:
> On Fri, Aug 30, 2013 at 10:49:48PM +0200, Michael Stapelberg wrote:
> > Steven Chamberlain  writes:
> > > Wouldn't ZFS be a more natural way to do something like this?
> > Possibly, but I have zero hopes of getting it set up and supported by
> > DSA, so we can’t use it for this service.
> 
> To address this specific thread, the challenge with ZFS is not that we don't
> like the idea (I'm keen on it, actually) but that it's not in the Linux kernel.
> We prefer to use stock Debian kernels than custom-built kernels or modules for
> our machines.
> 
> Is there another filesystem (or another approach) that would improve
> performance?

Funny that you ask.  What's the usual competitor for ZFS?
btrfs is included in stock kernels, doesn't take massive amounts of memory,
and has a different approach to deduplication.

Recent kernels are needed only for race-free deduplication, "cp --reflink"
works in oldstable.
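
What "cp --reflink" asks the kernel for can be sketched in a few lines: one
ioctl that makes the destination share the source's extents, with a plain
copy as fallback where the filesystem cannot do that. The ioctl number below
is the Linux clone request hard-coded as a constant, and the function name is
made up; treat this as an illustration, not a replacement for cp.

```python
import fcntl
import shutil

FICLONE = 0x40049409  # Linux ioctl number for a whole-file clone (assumed constant)


def reflink_or_copy(src, dst):
    """Try to clone `src` to `dst` sharing extents; fall back to a plain copy.

    Returns True if the clone ioctl succeeded.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            pass  # filesystem (or kernel) cannot reflink
    shutil.copyfile(src, dst)
    return False
```

On btrfs the clone is a metadata-only operation, so it costs about the same
regardless of file size; on ext4 or tmpfs the function simply copies.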

-- 
ᛊᚨᚾᛁᛏᚣ᛫ᛁᛊ᛫ᚠᛟᚱ᛫ᚦᛖ᛫ᚹᛖᚨᚲ


-- 
Archive: http://lists.debian.org/20130830232112.gb16...@angband.pl



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Holger Levsen
Hi,

On Samstag, 31. August 2013, Luca Filipozzi wrote:
> We are much more amenable to "how could I improve X so that it can run
> faster" than we are to...

I'd like to chime in and say a big thanks to DSA! piuparts.debian.org is 
running very very smoothly (and fast) now, thanks to the awesome ganeti 
cluster. And requests are usually answered+done very fast if they are sensible 
and if not, I usually learn nicely.

/me bows. Thanks a lot & keep up the good work!

(And these new schroot chroots are awesome too!)


cheers,
Holger

P.S.: they also take patches, why else would there be
http://anonscm.debian.org/gitweb/?p=users/holger/debian.org.git ? :)




Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Michael Stapelberg
Hi Luca,

Luca Filipozzi  writes:
> Why do you say that when you haven't even asked?
Because I thought the answer was going to be “not in the Linux kernel,
no chance”.

> To address this specific thread, the challenge with ZFS is not that we don't
> like the idea (I'm keen on it, actually) but that it's not in the
> Linux kernel.
See.

I don’t have any energy to spend on this right now. Don’t expect
answers, I am killing this thread in my mailclient.

I am sorry for what I wrote and did not mean to offend anybody.

Have a nice weekend.

-- 
Best regards,
Michael


--
Archive: http://lists.debian.org/x6y57izttj@midna.lan



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Luca Filipozzi
On Fri, Aug 30, 2013 at 10:49:48PM +0200, Michael Stapelberg wrote:
> Steven Chamberlain  writes:
> > Wouldn't ZFS be a more natural way to do something like this?
> Possibly, but I have zero hopes of getting it set up and supported by
> DSA, so we can’t use it for this service.

Why do you say that when you haven't even asked?

Significant issues for DSA are:

- service owners saying "I need to unpack the entire archive and I need to
  search it fast so get more hardware like SSDs" or equivalent without
  understanding our constraints
  
- service owners not engaging with us early in their implementation to
  understand what constraints we might need to impose or to sound us out on
  architecture to improve performance or to reduce resource usage or to reduce
  software complexity

We are much more amenable to "how could I improve X so that it can run faster"
than we are to "zero hopes that DSA will support it".  Disparaging us certainly
does not draw bees to honey, as they say.

To address this specific thread, the challenge with ZFS is not that we don't
like the idea (I'm keen on it, actually) but that it's not in the Linux kernel.
We prefer to use stock Debian kernels rather than custom-built kernels or modules for
our machines.  If we can overcome these challenges for ZFS (which are
significant, I admit), we would be amenable to a discussion regarding ZFS
(although it isn't a great match to our current hardware configuration).

Is there another filesystem (or another approach) that would improve
performance?  Are there other things we could consider (hdfs & solr, say)?

You know where we live,

Luca

-- 
Luca Filipozzi
http://www.crowdrise.com/SupportDebian


-- 
Archive: http://lists.debian.org/20130830221425.ga19...@emyr.net



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Steven Chamberlain
On 30/08/13 21:49, Michael Stapelberg wrote:
> Steven Chamberlain  writes:
>> Wouldn't ZFS be a more natural way to do something like this?
> Possibly, but I have zero hopes of getting it set up and supported by
> DSA, so we can’t use it for this service.

Oh I see.  That's fair enough, but there is some hope that could change
someday:  ZFS-on-Linux has been making good progress recently, and we
(GNU/kFreeBSD team and upstream developers of ZFS) can try to better
educate folks on ZFS generally.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org


-- 
Archive: http://lists.debian.org/52211337.7050...@pyro.eu.org



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Michael Stapelberg
Hi Steven,

Steven Chamberlain  writes:
> Wouldn't ZFS be a more natural way to do something like this?
Possibly, but I have zero hopes of getting it set up and supported by
DSA, so we can’t use it for this service.

-- 
Best regards,
Michael


--
Archive: http://lists.debian.org/x61u5a29ar@midna.lan



Re: How git performs when you throw all of Debian at it

2013-08-30 Thread Steven Chamberlain
Hi,

> [...] using git instead of the file system for storing the contents
> of Debian Code Search. The hope was that it would lead to fewer disk
> seeks and less data due to git's delta-encoding

Wouldn't ZFS be a more natural way to do something like this?

A choice of gzip, lzjb and more recently lz4 compression;  snapshots
and/or deduplication both reduce the amount of disk blocks and cache
memory needed.

I've pondered before at this overlap in functionality between packing by
Git, and those features of the ZFS filesystem.  They are doing much the
same thing but with different granularity.  It would be neat if they
could work together better.

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org


-- 
Archive: http://lists.debian.org/5220f898.6000...@pyro.eu.org