Re: Speedup cp command?

2016-08-24 Thread Thomas Schmitt
Hi,

i wrote:
> >xorriso -osirrox on:sort_lba_on:auto_chmod_on:o_excl_off \

Richard Owlett wrote:
> That works nicely.

xorriso packs them up, xorriso packs them out.


> Advantage over cp is progress messages.

It's more entertaining than watching cp -r. :))

In case one prefers silent operation, it is possible to restrict verbosity
to events of severity "WARNING" or worse:

  xorriso -report_about warning \
  -osirrox on:sort_lba_on:auto_chmod_on:o_excl_off \
  ...


> 3 of the 13 DVD's had problems with reading files and gave a fatal error
> message [thought I had saved it but I didn't] all successfully copied on
> retry.

In case you need to really fight for a damaged ISO, there would be ways
to combine the successfully read blocks from multiple runs of

  xorriso ...  -check_media  ...

to a single ISO image on disk and to inquire the paths of data files
which stay damaged.


Have a nice day :)

Thomas



Re: Speedup cp command?

2016-08-24 Thread Richard Owlett

On 8/24/2016 6:09 AM, Thomas Schmitt wrote:
> Hi,
>
> the proposed xorriso -extract run fails on mounted media unless
> the -osirrox parameter string contains
>
>    ":o_excl_off"
>
> I.e.:
>
>    xorriso -osirrox on:sort_lba_on:auto_chmod_on:o_excl_off \
>    -indev /dev/cdrom \
>    -extract / /media/richard/myrepository/dvd_1
> [snip]


That works nicely. Advantage over cp is progress messages.
3 of the 13 DVD's had problems with reading files and gave a 
fatal error message [thought I had saved it but I didn't] all 
successfully copied on retry.




Re: Speedup cp command?

2016-08-24 Thread Thomas Schmitt
Hi,

> for a DVD (but not a CD), I think just cat without isosize
> would work as well.

Only for DVD-R[W] written with write type DAO and even that depends
on the burn program used. DVDs have a natural chunk size of 32 KiB.
Only DAO respects your wish if you send smaller chunks.

Overwritable media (DVD-RAM, DVD+RW, BD-RE, formatted DVD-RW) in
most cases yield their full capacity when being read.


> files in
> ISO 9660 filesystems were written in sequence and should not cause much
> seeking.

One should think so. At least if data of tree neighbors have similar
block addresses. (Which is not the case in current Debian ISOs but seems
not to be the reason for the miserable performance of cp -r.)


> Any idea how they did manage that?

Well, in part "they" is me as software heir of Vreixo Formoso Lopes
who implemented a mapping from directory tree to data content which is
deterministic but quite chaotic. A red-black tree is involved which
tries to find identical input files.

In GNU xorriso-1.3.8 as used by debian-cd, this red-black tree is
traversed for data content mapping. Since 1.4.2, the ISO 9660 name tree,
which is alphabetically sorted by specification, gives the sequence for
this mapping.
Only pitfall is that ISO 9660 names are uppercase and possibly mangled
versions of the Rock Ridge names, which mount(8) uses by default.
So some hopping might still be needed when walking the mounted file
tree alphabetically.

But as demonstrated, the more reasonable mapping does not significantly
improve the performance of cp -r.


> rsync would speed things up since it establishes a list of all the
> files to copy before starting

Let's try with the non-chaotically repacked ISO.
Being a first-time user of rsync, i have to guess and RTFM.

  # mount /dev/sr1 /mnt/iso

  $ time rsync --no-p --no-g --no-o --chmod=ugo=rwX -r \
   /mnt/iso /dvdbuffer/debian_dvd_1_unpacked
  skipping non-regular file "iso/debian"
  ...

The noise is very promising. Little clonking.

  real    6m43.584s

That's 404 seconds. Not too bad, compared to 333 seconds for sequentially
copying the ISO to hard disk.
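
The conversion from the "real" line to seconds can be reproduced in the shell. A minimal sketch, assuming the usual NmS.SSSs format of the time builtin; the function name to_seconds is hypothetical:

```shell
# Convert a "real" value as printed by the shell's time builtin,
# e.g. 6m43.584s, into whole seconds (rounded to the nearest second).
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.0f\n", $1 * 60 + $2 }'
}

to_seconds 6m43.584s    # prints 404
```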

My rsync options still hate softlinks. But i guess this is not significant
for performance measurement.

rsync on the original DVD 1 needs 730 seconds. With much clonking.
cp -r needed 1048 seconds, osirrox without sort_lba needed 661 seconds.

So:
cp -r must be doing something that does not go well with the ISO 9660
driver of Linux.


Have a nice day :)

Thomas



Re: Speedup cp command?

2016-08-24 Thread Nicolas George
On Octidi 8 Fructidor, year CCXXIV, Thomas Schmitt wrote:
>   dd if=/dev/cdrom bs=1M count=$blocks of=/media/richard/myisos/dvd_1.iso

Useless use of dd. head -c will perform as well, without the need for
arithmetic. And, for a DVD (but not a CD), I think just cat without isosize
would work as well.
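
A minimal sketch of the head -c variant, assuming a head that supports -c (GNU and BSD do); the function name copy_first_bytes is hypothetical:

```shell
# Copy exactly the first $2 bytes of $1 into $3.
# No block-count arithmetic is needed, unlike with dd bs=1M count=...
copy_first_bytes() {
    head -c "$2" "$1" > "$3"
}
```

For the DVD case this would be called with the byte count reported by isosize:

  copy_first_bytes /dev/cdrom "$(/sbin/isosize /dev/cdrom)" \
    /media/richard/myisos/dvd_1.iso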

> A discussion on reproducible-builds a year ago yielded that the file
> content sorting order by libisofs did not match the sorting order of
> directory records in the tree. This was fixed by release 1.4.2.

While reading the beginning of your mail, I was about to point out that files in
ISO 9660 filesystems were written in sequence and should not cause much
seeking. Obviously, you already knew it and thought of it.

Any idea how they did manage that? Naively, I imagine that creating the
directory index and then creating the file entries is done from the same
in-memory data structure after a single sort.

> Nevertheless it turns out that the layers of Debian GNU/Linux 8 still
> do a poor job. I repacked the ISO by xorriso-1.4.5 and verified that
> the data extents are sorted according to the sorting of the ECMA-119
> and Rock Ridge tree. Simple tree traversal or alphabetically sorted
> tree traversal would yield smooth reading, but cp -r has different ideas
> about sequence.

You can use -v to easily know the order cp chooses. AFAIK, cp -r does no
sorting and uses the order from the kernel, and the kernel, for ISO 9660
uses the order in the directory data, so that should work ok.
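
A minimal sketch of that check, usable on any directory tree: it performs the copy with -v and keeps the verbose output as a log, so the first lines show the order in which cp handled the entries. The function name copy_with_order_log and the log path are hypothetical, and the quoting style of the cp -v lines differs between coreutils versions.

```shell
# Copy $1 to $2 recursively and record the order of operations in $3.
# The exit status is that of cp.
copy_with_order_log() {
    cp -rv "$1" "$2" > "$3"
}
```

Example use:

  copy_with_order_log /media/cdrom0 /media/richard/myrepository/dvd_1 /tmp/cp_order.log
  head -n 20 /tmp/cp_order.log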

Actually, since rsync does its own sorting, it could lead to worse
results in this case.

Still, we are speaking of a Debian install CD: the bulk of the data should
be made of a pool directory with only subdirectories on one level containing
plain files. All with file names from the almost-portable character set (+
and ~ are used), in lowercase. There is not much room for sorting
discrepancies.

But IIRC, ISO 9660 stores all the directory structure first and only then
the files' payload. That could be an explanation, since cp -r reads the
directories as they come (and even, apparently, subdirectories after it has
copied the plain files). That could explain seeking:

readdir pool
readdir pool/a [no seeking]
copy pool/a/a-1.deb [seeking over the rest of the directory structure]
copy pool/a/a.orig.tar.gz [no seeking]
readdir pool/b [seeking back]
...

In that case, running "find /media/cdrom > /dev/null" repeatedly to keep the
whole structure of the hierarchy in the inodes and dentries cache could
speed things up.
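
A minimal sketch of that idea, with the tree walked once before the copy rather than repeatedly. The function name warm_and_copy is hypothetical, and whether the caches stay warm for the whole copy depends on available RAM, so the effect on a real drive is untested here.

```shell
# Walk the whole tree first so that directory entries and inodes are
# loaded into the kernel caches, then do the recursive copy.
warm_and_copy() {
    src=$1
    dst=$2
    find "$src" > /dev/null || return 1
    cp -r "$src" "$dst"
}
```

Example use:

  warm_and_copy /media/cdrom /media/richard/myrepository/dvd_1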

And also, rsync would speed things up since it establishes a list of all the
files to copy before starting, and its sort order should yield the same
result with these particular file names. I do not have an optical drive at
hand to check.




Re: Speedup cp command?

2016-08-24 Thread Thomas Schmitt
Hi,

the proposed xorriso -extract run fails on mounted media unless
the -osirrox parameter string contains

  ":o_excl_off"

I.e.:

  xorriso -osirrox on:sort_lba_on:auto_chmod_on:o_excl_off \
  -indev /dev/cdrom \
  -extract / /media/richard/myrepository/dvd_1

Else you will see an error message

  libburn : SORRY : Cannot open busy device '/dev/sr1' : Device or resource busy


I do not get this message because i circumvent the /dev/sr global
ioctl mutex by using /dev/sg instead. So i can operate more than one
drive at a time by ioctl(SG_IO).
(The mutex seems to be totally unnecessary. It was installed 6 years
 ago by a drive-by programmer who removed the Big Kernel Lock.
 Some of my users hacked their kernels to get rid of it.)


Have a nice day :)

Thomas



Re: Speedup cp command?

2016-08-24 Thread Thomas Schmitt
Hi,

Richard Owlett wrote:
> I used
>   cp -R /media/cdrom0 /media/richard/myrepository/dvd_1
> It gave me what I wanted [*N.B.* I did not want dvd_1.iso]
> It was SLOW.

An average DVD+RW can be read at about 10 MB/s average speed.
That would be about 7 minutes.
Reading usually is slower in the inner area and faster outwards.
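
As a rough check of that estimate, a minimal sketch of the arithmetic. The image size of 4400 MB is an assumption for illustration (roughly a full single-layer DVD), not a figure from this thread; the function name read_time_seconds is hypothetical.

```shell
# Estimate sequential read time: size in MB divided by speed in MB/s.
# The numbers passed below are illustrative assumptions.
read_time_seconds() {
    size_mb=$1
    speed_mb_per_s=$2
    echo $(( size_mb / speed_mb_per_s ))
}

secs=$(read_time_seconds 4400 10)
echo "$secs s, about $(( secs / 60 )) min"    # prints: 440 s, about 7 min
```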

A major slowdown is caused by scattered random access. The optical head
moves to a new position quite slowly and often loudly.
Copying the plain ISO image does not involve random access.
Random access on hard disk or in RAM is much faster.


So an intermediate .iso might be the fastest vanilla way to get the
data from medium to disk.
Depending on your disk speed and RAM luxury, the additional cp -r for
unpacking might still end before a plain cp -r would have ended.

  blocks=$(expr $(/sbin/isosize /dev/cdrom) / 1024 / 1024 + 1)
  mkdir /media/richard/myisos
  dd if=/dev/cdrom bs=1M count=$blocks of=/media/richard/myisos/dvd_1.iso
  mkdir /mnt/iso
  mount -o loop /media/richard/myisos/dvd_1.iso /mnt/iso
  cp -r /mnt/iso /media/richard/myrepository/dvd_1
  umount /mnt/iso
  rm /media/richard/myisos/dvd_1.iso
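
The expr line above rounds down and then adds 1, so it asks for one extra MiB when the image size is an exact multiple of 1 MiB. That is harmless, but the copy can then be slightly larger than the ISO. A minimal sketch of a ceiling division instead; the function name mib_blocks is hypothetical:

```shell
# Number of 1 MiB blocks needed to hold a given byte count (rounded up).
mib_blocks() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

mib_blocks 1048576    # exactly 1 MiB: prints 1
mib_blocks 1048577    # one byte more: prints 2
```

It would replace the expr line like this:

  blocks=$(mib_blocks "$(/sbin/isosize /dev/cdrom)")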


If you do not have buffer space for the ISO or want to avoid the
intermediate steps, try this:

  xorriso -osirrox on:sort_lba_on:auto_chmod_on \
  -indev /dev/cdrom \
  -extract / /media/richard/myrepository/dvd_1

My measurements with DVD+RW on drive LG GH24NSC0:

  dd to .iso:   333 s

  cp -r     :  1084 s  despite lots of RAM ! Miserable noises from drive.

  osirrox   :   342 s  with "sort_lba_on" which lets it read with
                       monotonically ascending block addresses.

                661 s  without "sort_lba_on". Clonks less than cp -r.
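
To put the measurements above in proportion, a minimal sketch that expresses each run time as a percentage of the dd baseline, using integer arithmetic. The times are the ones listed above; the function name percent_of_baseline and the labels are hypothetical.

```shell
# Express a run time ($1, seconds) as a percentage of a baseline ($2, seconds).
percent_of_baseline() {
    echo $(( $1 * 100 / $2 ))
}

base=333    # dd to .iso
for entry in "cp-r:1084" "osirrox-sorted:342" "osirrox-unsorted:661"; do
    name=${entry%%:*}
    secs=${entry##*:}
    echo "$name: $(percent_of_baseline "$secs" "$base")% of dd time"
done
```

This prints 325% for cp -r, 102% for osirrox with sort_lba_on and 198% for osirrox without it.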


debian-cd could get a file arrangement which is more friendly to copiers
if it would use a newer version of xorriso.
A discussion on reproducible-builds a year ago yielded that the file
content sorting order by libisofs did not match the sorting order of
directory records in the tree. This was fixed by release 1.4.2.

Nevertheless it turns out that the layers of Debian GNU/Linux 8 still
do a poor job. I repacked the ISO by xorriso-1.4.5 and verified that
the data extents are sorted according to the sorting of the ECMA-119
and Rock Ridge tree. Simple tree traversal or alphabetically sorted
tree traversal would yield smooth reading, but cp -r has different ideas
about sequence.

  cp -r   :  998 s  still clonking terribly.

  osirrox without sort_lba:  356 s  working smoothly.
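
The alphabetically sorted traversal mentioned above can be approximated in the shell. A minimal sketch, assuming GNU find, sort, xargs and cp (for -print0, -z, -r, --parents and -t). The function name copy_sorted is hypothetical. It copies regular files only (no symbolic links or empty directories), and it was not measured in this thread, so whether it beats cp -r on a given medium is untested.

```shell
# Copy all regular files below $1 into $2, visiting them in byte-wise
# alphabetical path order instead of the order cp -r would choose.
copy_sorted() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    dst_abs=$(cd "$dst" && pwd) || return 1
    (
        cd "$src" || exit 1
        find . -type f -print0 \
            | LC_ALL=C sort -z \
            | xargs -0 -r cp --parents -t "$dst_abs"
    )
}
```

Example use:

  copy_sorted /mnt/iso /media/richard/myrepository/dvd_1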

(xorriso-1.4.6 is planned to come soon with more changes proposed by
 reproducible-builds. So i do not prod debian-cd now.)


Have a nice day :)

Thomas



Re: Speedup cp command?

2016-08-23 Thread David Wright
On Tue 23 Aug 2016 at 17:23:31 (-0400), Greg Wooledge wrote:
> On Tue, Aug 23, 2016 at 04:16:42PM -0500, Richard Owlett wrote:
> > Thanks. I'll try it as soon as copy of DVD#2 ends.
> > What's special about a loop mount in this circumstance? As I read 
> > the rsync man page it was pretty similar to cp and it had 
> > accepted a plain automount [I'm on Jessie with Mate DE]]
> 
> rsync vs. cp won't make any difference if the destination directory
> is empty.  In either case, you have to read every byte of input and
> write every byte of output.

Does it make any difference if the DVD drive and the hard drive
are master and slave on the same IDE controller?

Cheers,
David.



Re: Speedup cp command?

2016-08-23 Thread Nicolas George
> On Tue, Aug 23, 2016 at 03:18:30PM -0500, Richard Owlett wrote:
> > I'm copying Debian distribution DVDs.
> > I used
> >   cp -R /media/cdrom0 /media/richard/myrepository/dvd_1
> > 
> > It gave me what I wanted [*N.B.* I did not want dvd_1.iso]
> > It was SLOW.

Optical media are « SLOW ». You can check for yourself: measure the actual
speed (using for example « iostat 1 », or maybe « df
/media/richard/myrepository/dvd_1; sleep 100; df
/media/richard/myrepository/dvd_1 » and a calculator), and compare to the
official speed of your drive.
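
The df variant, including the calculator step, can be sketched like this. It assumes the POSIX output format of df -kP (used space in the third column); the function names used_kib and rate_kib_per_s are hypothetical.

```shell
# Used space of the filesystem holding $1, in KiB.
used_kib() {
    df -kP "$1" | awk 'NR == 2 { print $3 }'
}

# Average write rate in KiB/s from two readings ($1 before, $2 after)
# taken $3 seconds apart.
rate_kib_per_s() {
    echo $(( ($2 - $1) / $3 ))
}
```

While the copy is running:

  before=$(used_kib /media/richard/myrepository/dvd_1)
  sleep 100
  after=$(used_kib /media/richard/myrepository/dvd_1)
  rate_kib_per_s "$before" "$after" 100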

> > The man page for rsync suggested that it could do it faster.
> > Can it?

No.

> > If so, what is correct syntax to get the same result as the command above?

« rsync -r » instead of « cp -r » (or -R, they are synonyms for cp but not
for rsync), nothing more. But in this particular case, I would advise -a
instead of -r for both.

Andrew M.A. Cater wrote:
> Loop mount the DVD
> 
> mount -t iso /media/cdrom0 -o loop /mnt

That does not make any sense. In this situation, /media/cdrom0 is a
directory. You can not do loopback on a directory.

If the data is on an optical medium, it must be read from the optical
medium, at the speed of the optical medium; there is no way around it. Even if
you manage to convince the kernel to do loopback on something that is
already a block device, it will not help.




Re: Speedup cp command?

2016-08-23 Thread Andrew M.A. Cater
On Tue, Aug 23, 2016 at 04:16:42PM -0500, Richard Owlett wrote:
> On 8/23/2016 4:05 PM, Andrew M.A. Cater wrote:
> >On Tue, Aug 23, 2016 at 03:18:30PM -0500, Richard Owlett wrote:
> >>I'm copying Debian distribution DVDs.
> >>I used
> >>   cp -R /media/cdrom0 /media/richard/myrepository/dvd_1
> >>
> >>It gave me what I wanted [*N.B.* I did not want dvd_1.iso]
> >>It was SLOW.
> >>The man page for rsync suggested that it could do it faster.
> >>Can it?
> >>If so, what is correct syntax to get the same result as the command above?
> >>
> >>TIA
> >
> >Loop mount the DVD
> >
> >mount -t iso /media/cdrom0 -o loop /mnt
> >
> >cd /mnt
> >
> >and you should see all the files within the DVD.
> >
> >cd /media/richard/repository/dvd_1/
> >
> >rsync -pavz /mnt/ .
> >
> >Should do it. If you stop and restart rsync, it should start from the place 
> >it left off, more or less.
> >
> >Hope this helps,
> >
> >AndyC
> 
> Thanks. I'll try it as soon as copy of DVD#2 ends.
> What's special about a loop mount in this circumstance? As I read the rsync
> man page it was pretty similar to cp and it had accepted a plain automount
> [I'm on Jessie with Mate DE]]
> 

Loop mount allows you to "see inside" the mounted media, effectively.

I've just noticed a typo - if you want to copy and paste the instructions,
then for you, at least, that will need to be

cd /media/richard/myrepository/dvd_1/ ; rsync -pavz /mnt/ .

[I missed out the "my" in myrepository :) ]

As someone else pointed out: you still need to copy everything, so it can be
slow, especially on 4G. Essentially you're probably copying from one
portion of the disk to another, so there are lots of reads and writes on the
same disk. Rsync just allows you to start/stop much more readily.

All the best,

AndyC



Re: Speedup cp command?

2016-08-23 Thread Richard Owlett

On 8/23/2016 4:23 PM, Greg Wooledge wrote:
> On Tue, Aug 23, 2016 at 04:16:42PM -0500, Richard Owlett wrote:
> > Thanks. I'll try it as soon as copy of DVD#2 ends.
> > What's special about a loop mount in this circumstance? As I read
> > the rsync man page it was pretty similar to cp and it had
> > accepted a plain automount [I'm on Jessie with Mate DE]]
>
> rsync vs. cp won't make any difference if the destination directory
> is empty.  In either case, you have to read every byte of input and
> write every byte of output.

I can see that. I was wondering if rsync was using a better
buffering scheme. That might make a difference when copying
between rotating media.

> rsync is tremendously useful when you've already got a partial copy
> of the input.  It uses heuristics to figure out what it actually needs
> to copy, and skips the parts you already have.





Re: Speedup cp command?

2016-08-23 Thread Greg Wooledge
On Tue, Aug 23, 2016 at 04:16:42PM -0500, Richard Owlett wrote:
> Thanks. I'll try it as soon as copy of DVD#2 ends.
> What's special about a loop mount in this circumstance? As I read 
> the rsync man page it was pretty similar to cp and it had 
> accepted a plain automount [I'm on Jessie with Mate DE]]

rsync vs. cp won't make any difference if the destination directory
is empty.  In either case, you have to read every byte of input and
write every byte of output.

rsync is tremendously useful when you've already got a partial copy
of the input.  It uses heuristics to figure out what it actually needs
to copy, and skips the parts you already have.



Re: Speedup cp command?

2016-08-23 Thread Richard Owlett

On 8/23/2016 4:05 PM, Andrew M.A. Cater wrote:
> On Tue, Aug 23, 2016 at 03:18:30PM -0500, Richard Owlett wrote:
> > I'm copying Debian distribution DVDs.
> > I used
> >    cp -R /media/cdrom0 /media/richard/myrepository/dvd_1
> >
> > It gave me what I wanted [*N.B.* I did not want dvd_1.iso]
> > It was SLOW.
> > The man page for rsync suggested that it could do it faster.
> > Can it?
> > If so, what is correct syntax to get the same result as the command above?
> >
> > TIA
>
> Loop mount the DVD
>
> mount -t iso /media/cdrom0 -o loop /mnt
>
> cd /mnt
>
> and you should see all the files within the DVD.
>
> cd /media/richard/repository/dvd_1/
>
> rsync -pavz /mnt/ .
>
> Should do it. If you stop and restart rsync, it should start from the place it
> left off, more or less.
>
> Hope this helps,
>
> AndyC

Thanks. I'll try it as soon as copy of DVD#2 ends.
What's special about a loop mount in this circumstance? As I read
the rsync man page it was pretty similar to cp and it had
accepted a plain automount [I'm on Jessie with Mate DE]]





Re: Speedup cp command?

2016-08-23 Thread Andrew M.A. Cater
On Tue, Aug 23, 2016 at 03:18:30PM -0500, Richard Owlett wrote:
> I'm copying Debian distribution DVDs.
> I used
>   cp -R /media/cdrom0 /media/richard/myrepository/dvd_1
> 
> It gave me what I wanted [*N.B.* I did not want dvd_1.iso]
> It was SLOW.
> The man page for rsync suggested that it could do it faster.
> Can it?
> If so, what is correct syntax to get the same result as the command above?
> 
> TIA

Loop mount the DVD

mount -t iso /media/cdrom0 -o loop /mnt

cd /mnt

and you should see all the files within the DVD.

cd /media/richard/repository/dvd_1/

rsync -pavz /mnt/ .

Should do it. If you stop and restart rsync, it should start from the place it
left off, more or less.

Hope this helps,

AndyC



Speedup cp command?

2016-08-23 Thread Richard Owlett

I'm copying Debian distribution DVDs.
I used
  cp -R /media/cdrom0 /media/richard/myrepository/dvd_1

It gave me what I wanted [*N.B.* I did not want dvd_1.iso]
It was SLOW.
The man page for rsync suggested that it could do it faster.
Can it?
If so, what is correct syntax to get the same result as the 
command above?


TIA