Re: all estimate timed out

2013-04-05 Thread Chris Hoogendyk

Thank you!

Not sure why the debug file would list runtar a second time, as if it were a parameter, when it's not 
to be passed as one. Anyway, that got it working.


Which brings me back to my original problem. As indicated previously, the filesystem in question 
only has 2806 files and 140 directories. Watching the runtar in verbose mode, I see it spend 20 
seconds on each tif file. The tif files are scans of herbarium type specimens and are pretty 
uniformly 200MB each. If I do a find on all the tif files, piped to `wc -l`, there are 1300 of them. 
At 20 seconds each, that gives the 26000 seconds that shows up in the sendsize debug file for this 
filesystem.
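
(As a quick shell sanity check of that arithmetic, run against the same directory:)

   find /export/herbarium -name '*.tif' | wc -l   # ~1300 files
   # 1300 files x 20 s/file  = 26000 s of estimate time
   # 200 MB/file / 20 s/file = 10 MB/s effective read rate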


So, why would these tif files only be going by at 10MB/s into /dev/null? No compression involved. My 
(real) tapes run much faster than that. I also pointed out that I have more than a dozen other 
filesystems on the same zpool that are giving me no trouble (five 2TB drives in a raidz1 on a J4500 
with multipath SAS).


Any ideas how to speed that up?

I think I may start out by breaking them down into sub-DLEs. There are 129 directories 
corresponding to taxonomic families.
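
(If I go that route, the disklist entries might look something like this; the dumptype name and 
the family directories below are just placeholders:)

   # one DLE per taxonomic family directory
   localhost /export/herbarium/Asteraceae user-tar
   localhost /export/herbarium/Fabaceae   user-tar
   localhost /export/herbarium/Rosaceae   user-tar
   # ... and so on for the rest of the 129 families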



On 4/4/13 8:05 PM, Jean-Louis Martineau wrote:

On 04/04/2013 02:48 PM, Chris Hoogendyk wrote:
I may just quietly go nuts. I'm trying to run the command directly. In the debug file, one 
example is:


Mon Apr  1 08:05:49 2013: thd-32a58: sendsize: Spawning "/usr/local/libexec/amanda/runtar runtar 
daily /usr/local/etc/amanda/tools/gtar --create --file /dev/null --numeric-owner --directory 
/export/herbarium --one-file-system --listed-incremental 
/usr/local/var/amanda/gnutar-lists/localhost_export_herbarium_1.new --sparse --ignore-failed-read 
--totals ." in pipeline


So, I created a script working off that and adding verbose:

   #!/bin/ksh

   OPTIONS="--create --file /dev/null --numeric-owner --directory /export/herbarium"
   OPTIONS="${OPTIONS} --one-file-system --listed-incremental"
   OPTIONS="${OPTIONS} /usr/local/var/amanda/gnutar-lists/localhost_export_herbarium_1.new"
   OPTIONS="${OPTIONS} --sparse --ignore-failed-read --totals --verbose ."

   COMMAND="/usr/local/libexec/amanda/runtar runtar daily /usr/local/etc/amanda/tools/gtar ${OPTIONS}"

   #COMMAND="/usr/sfw/bin/gtar ${OPTIONS}"

remove the 'runtar' argument
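
(That is, with the duplicated word dropped, the assignment becomes something like:)

   COMMAND="/usr/local/libexec/amanda/runtar daily /usr/local/etc/amanda/tools/gtar ${OPTIONS}"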



   exec ${COMMAND}


If I run that as user amanda, I get:

   runtar: Can only be used to create tar archives


If I exchange the two commands so that I'm using gtar directly rather than 
runtar, then I get:

   /usr/sfw/bin/gtar: Cowardly refusing to create an empty archive
   Try `/usr/sfw/bin/gtar --help' or `/usr/sfw/bin/gtar --usage' for more
   information.


--
---

Chris Hoogendyk

-
   O__   Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~ - University of Massachusetts, Amherst



---

Erdös 4



Re: all estimate timed out

2013-04-05 Thread Chris Hoogendyk
OK, folks, it is the "--sparse" option that Amanda is putting on the gtar command line. This is 
/usr/sfw/bin/gtar, GNU tar version 1.23, on Solaris 10. I have a test script that runs the runtar 
and a test directory with just 10 of the tif files in it.


Without the "--sparse" option, time tells me that it takes 0m0.57s to run the 
script.

With the "--sparse" option, time tells me that it takes 3m14.91s to run the 
script.

Scale that from 10 to 1300 tif files, and I have serious issues.
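
(For reference, the test boils down to something like this; the test directory path here is made up:)

   cd /export/herbarium-test   # 10 of the ~200MB tif files
   time /usr/sfw/bin/gtar --create --file /dev/null .            # 0m0.57s
   time /usr/sfw/bin/gtar --create --file /dev/null --sparse .   # 3m14.91s

(The non-sparse run is nearly instant because gtar notices the archive is /dev/null and skips 
reading file contents entirely; --sparse forces it to read every file looking for holes.)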

Now what? Can I tell Amanda not to do that? What difference will it make? Is 
this a bug in gtar?




--
---

Chris Hoogendyk

-
   O__   Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~ - University of Massachusetts, Amherst



---

Erdös 4



Re: all estimate timed out

2013-04-05 Thread Brian Cuttler

Chris,

I don't know what tif files look like internally, or how well
they compress.

Just out of left field... does your zpool have compression
enabled? I realize ZFS will compress or not on a per-block
basis, but I don't know what overhead, if any, is being incurred;
if the tif files are not compressed, then there should be no
additional overhead to decompress them on read.
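
(Something like this would show it; the pool/dataset name here is a guess, adjust to yours:)

   zfs get compression,compressratio tank/export/herbarium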

I would also probably hesitate to enable compression on a ZFS
file system used as an Amanda work area, since you would be
storing data that has already been compressed. Though this also
has no impact on the estimate phase.

Our site has tended to use gzip --fast rather than --best, and
on a few of our Amanda servers we have moved to pigz. Again,
potential amdump issues, but not amcheck issues.

Sanity check: the zpool itself is healthy? The drives are all of
the same type and spindle speed?
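
(For instance:)

   zpool status -x   # prints only pools with problems
   iostat -xn 5      # watch for one device with outsized service times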

good luck,

Brian


---
   Brian R Cuttler                 brian.cutt...@wadsworth.org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773



Re: all estimate timed out

2013-04-05 Thread Jean-Louis Martineau

On 04/05/2013 12:09 PM, Chris Hoogendyk wrote:
OK, folks, it is the "--sparse" option that Amanda is putting on the
gtar. [...]

Now what? Can I tell Amanda not to do that? What difference will it
make? Is this a bug in gtar?


Use the amgtar application instead of the GNUTAR program; it allows
you to disable the sparse option.


tar can't know where the holes are; it has to read the files to find them.

You WANT the sparse option; otherwise your backup will be large,
because tar fills in the holes with zeros.


Your best option is to use the calcsize or server estimate.
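
(A sketch of what that might look like in amanda.conf, assuming a 3.x setup; the names
"app_amgtar" and "herbarium-tar" are placeholders. SPARSE is the property you could turn
off, but per the above you likely want to keep it and change the estimate instead:)

   define application-tool app_amgtar {
      plugin "amgtar"
      property "SPARSE" "YES"   # the default; set "NO" to drop --sparse entirely
   }

   define dumptype herbarium-tar {
      program "APPLICATION"
      application "app_amgtar"
      estimate calcsize         # or server; skips the slow gtar estimate run
   }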

Jean-Louis