Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread markw
 On Tue, Oct 25, 2011 at 11:20 PM,  ma...@mohawksoft.com wrote:

 Obviously, there are pros and cons to various approaches. Your approach is
 only faster if you know that the file in question was recently modified.
 If that is the case, you lucked out. What happens if the source of the
 calculations, which is also needed, hasn't been modified in some time?


 Are you serious? If the source of the calculations hasn't been modified
 in some time, then what you do is restore it from the backups. Same as
 if it was modified recently.

I was thinking that, if you are doing an incremental backup, it may not
be present on the current backup medium, and then you'd have to search
whatever catalogue system you have to find where it is.


 The older files weren't backed up recently because they hadn't changed.
 They didn't get erased from the backups, so they're still available for
 restoration, just like the recently changed files.

 Did you think that all backups older than one day were somehow erased?

Typically, you ship some backups off-site to ensure recovery.





 --
 John Abreau / Executive Director, Boston Linux & Unix
 OLD GnuPG KeyID: D5C7B5D9 / Email: abre...@gmail.com
 OLD GnuPG FP: 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99
 2011 GnuPG KeyID: 32A492D8 / Email: abre...@gmail.com
 2011 GnuPG FP:





Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread Matt Shields
On Tue, Oct 25, 2011 at 10:16 PM, Richard Pieri richard.pi...@gmail.com wrote:

 On Oct 25, 2011, at 7:51 PM, ma...@mohawksoft.com wrote:
 
  The snapshot has no effect on the master, and yes, we've already said and
  we already know it is a weakness in LVM that if you don't extend your
  snapshots you lose them. This can be mitigated by monitoring and automatic
  volume extension.

 You missed it.  This isn't about what happens to master.  It's what happens
 to b when a disappears.  If master -> a -> b and a disappears due to reaping
 then b becomes useless.  Or b is reaped, too.  Either way you're dealing
 with data loss.  This is why LVM will not do what you originally asked
 about.

 Monitoring has problems.  If the volume fills up faster than the monitor
 polls capacity then you lose your data.  If the volume fills up faster than
 it can be extended then you lose your data.  If the volume cannot be
 extended because the volume group has no more extents available then you
 lose your data.  Like I wrote at the start: LVM will quite happily bite your
 face off.

 Now, to address your most recent question:

 How do I back up a 1TB disk?  Think about this: how do you intend to do a
 restore from this backup?  The most important part of a backup system is
 being able to restore from backup in a timely fashion.

 I have in production a compute server with two 8TB file systems and a 9TB
 file system, all sitting on LVM volumes.  I have an automated backup that
 runs every night on this server.  It's an incremental file system backup so
 I'm only backing up the changes every night.  This is, as you might expect,
 considerably faster than trying to do full backups of 25TB every night -- which I
 can't do because it would take three days.

 On smaller capacity volumes, in the several hundred GB range, I use
 rsnapshot to do incremental file snapshots to a storage server.  Again, I
 don't back up the raw disk partitions every time.  I only back up the
 changed files.
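 A minimal rsnapshot.conf sketch for that kind of setup looks something like
 this (host and path names here are made up, and rsnapshot wants tabs between
 fields):

     snapshot_root   /srv/backups/
     interval        daily   7
     interval        weekly  4
     backup          root@fileserver:/home/  fileserver/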

 In both cases -- and in fact with all my backups -- they are file level
 backups.  The reason being that if I need to restore a single file or
 directory then I don't have to rebuild the entire volume to do so.  I can
 restore as little or as much as I need to recover from a mistake or a
 disaster.

 Consider the case of a live volume that needs to be in a frozen state for
 doing a backup.  Database servers are prime examples of this.  Here, I would
 freeze the database, make a snapshot of the underlying volume, and then thaw
 the database.  Now I can do my backup of the read-only snapshot volume
 without interfering with the running system.  I would delete the snapshot
 when the backup is complete.
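 As a rough sketch of that cycle (volume, mount point and sizes are
 illustrative, and fsfreeze stands in here for whatever database-level
 freeze command applies):

     fsfreeze -f /srv/db                          # freeze writes
     lvcreate -s -n dbsnap -L20G /dev/vg0/dbdata  # point-in-time snapshot
     fsfreeze -u /srv/db                          # thaw immediately
     mount -o ro /dev/vg0/dbsnap /mnt/dbsnap
     tar czf /backup/db-$(date +%F).tar.gz -C /mnt/dbsnap .
     umount /mnt/dbsnap
     lvremove -f /dev/vg0/dbsnap                  # drop the snapshot when done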

 If I were using plain LVM and ext3 for my users' home directories then I
 would do something similar with read-only snapshots.  There would be no
 freeze step, and I would keep several days' worth of snapshots on the file
 server to make error recovery faster than going to tape or network storage.
  As it is, I use OpenAFS which has file system snapshots so I don't need to
 do any of this and users can go back in time just by looking in .clone in
 their home directories.  I still have nightly backups to tape for long-term
 archives.
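 For the plain LVM case, a nightly cron job along these lines would cover
 that snapshot rotation (names and sizes are only examples):

     DAY=$(date +%a)                                    # Mon, Tue, ...
     lvremove -f /dev/vg0/home-$DAY 2>/dev/null         # drop last week's copy
     lvcreate -s -p r -n home-$DAY -L50G /dev/vg0/home  # new read-only snapshot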

 Now, time to poke holes in your proposal.  I have a physics graduate
 student doing his thesis research project on a shared compute server along
 with a dozen others.  They collectively have 7.5TB of data on there.  This
 is a real-world case on the aforementioned compute server.  Said student
 accidentally wipes out his entire thesis project, 200GB worth of files.
  It's 9:30 PM and he needs his files by 8am or he fails his thesis defense,
 doesn't graduate and I'm looking for a new job.

 With my file level backup system I can have his files restored within a
 couple of hours at the outside without affecting anyone else's work.

 With your volume level backup system I would spend the night on Monster
 looking for a new job.  The problem with it is that I can't restore
 individual files because it isn't individual files that are backed up.  It's
 the disk blocks.  I can't just drop those backed-up blocks onto the volume.
  Here:

  master -> changes -> changes -> changes
   \-> backup

 If I dumped the backup blocks onto the volume then I'd scramble the file
 system.  Restoration would require me to replicate the entire volume at the
 block level as it was when the backup was made.  This would destroy all the
 other researchers' work done in the past however many hours since that
 backup was made.  I would fire myself for gross incompetence if I were
 relying on this kind of backup system.  It's that bad.

 It gets worse.  What happens when the whole thing fails outright?  Total
 disaster on your 1TB disk.  Now it's not just 29 minutes to restore last
 night's blocks.  It's two hours to restore the initial replica and then 30
 minutes times however many deltas have been made.  Six deltas means 5 hours
 to do a full rebuild.  I can do 

Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread markw
 On Oct 25, 2011, at 11:20 PM, ma...@mohawksoft.com wrote:

 Actually, in LVM 'a' and 'b' are completely independent; each has its
 own copy of the COW data. So, if 'a' gets nuked, 'b' is fine, and vice
 versa.

 Rather, this is how LVM works *because* of this situation.  If LVM
 supported snapshots of snapshots then it'd be trivially easy to shoot
 yourself in the foot.

Actually, I'm currently working on a system that does snapshots of snapshots.
It's not LVM, obviously, but it quietly resolves an interior copy being
removed or failing. It's very much an enterprise-class system.


 If you know the behaviour of your system, you could allocate a large
 percentage of the existing volume size (even 100%) and mitigate any risk.
 You would get your snapshot quickly and still have full backing.

 So for your hypothetical 1TB disk, let's assume that you actually have 1TB
 of data on it.  You would need two more 1TB disks for each of the two
 snapshots.  This would be unscalable to my 25TB compute server.  I would
 need another 25TB+ to implement your scheme.  This is a case where I can
 agree that yes, it is possible to coerce LVM into doing it but that
 doesn't make it useful.

Well, we all know that disks do not change 100% very quickly, if at all.
It's typically a very small percentage per day, even on active systems.

So the process is to back up diffs by comparing two snapshots: a start
point and an end point. Just keep recycling the start point.
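For illustration only -- this is not the utility I'm describing, which
presumably reads the snapshot COW metadata instead of scanning everything --
a brute-force version of that diff needs nothing more than dd and cmp,
walking both snapshots in fixed-size chunks and saving only the chunks that
changed (snapshot names are hypothetical):

    #!/bin/bash
    START=/dev/vg0/snap_start        # previous snapshot (start point)
    END=/dev/vg0/snap_end            # current snapshot (end point)
    OUT=/backup/$(date +%F)          # changed chunks land here, named by index
    CHUNK=$((4 * 1024 * 1024))       # 4 MiB chunks

    mkdir -p "$OUT"
    SIZE=$(blockdev --getsize64 "$END")
    CHUNKS=$(( (SIZE + CHUNK - 1) / CHUNK ))

    for ((i = 0; i < CHUNKS; i++)); do
        # compare one chunk from each snapshot; keep it only if it changed
        if ! cmp -s <(dd if="$START" bs=$CHUNK skip=$i count=1 2>/dev/null) \
                    <(dd if="$END"   bs=$CHUNK skip=$i count=1 2>/dev/null); then
            dd if="$END" bs=$CHUNK skip=$i count=1 of="$OUT/chunk.$i" 2>/dev/null
        fi
    done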



 In a disaster, well, brute force will save the day.

 My systems work for both individual files and total disaster.  I've proven
 it.

Yes, backups that maintain data integrity work. That's sort of the job.
The issue is reducing the amount of data that needs to be moved each time.

With a block level backup, you move only the changed blocks. With a file level
backup you move whole files. Now, if the files are small, a file level
backup will make sense. If the files are large, like VMs or databases, a
block level backup makes sense.



 don't need to do any of this and users can go back in time just by looking
 in .clone in their home directories.  I still have nightly backups to tape
 for long-term archives.

 Seems complicated.

 It isn't.  It's a single AFS command to do the nightly snapshot and a
 second to run the nightly backup against that snapshot.



 totally wrong!!!

 lvcreate -s -n disaster -L1024G /dev/vg0/phddata
 (my utility)
 lvclonesnapshot /dev/mapper/vg0-phdprev-cow /dev/vg0/disaster

 This will apply historical changes to /dev/vg0/disaster; the volume
 may then be used to restore data.

 Wait... wait... so you're saying that in order to restore some files I
 need to recreate the disaster volume, restore to it, and then I can copy
 files back over to the real volume?

I can't tell the whole example from the snippet, but I think I was saying
that I could clone a snapshot, apply historical blocks to it, and then
you'd be able to get a specific version of a file from it. Yes.

If you are backing up many small files, rsync works well. If you are
backing up VMs, databases, or iSCSI targets, a block level strategy works
better.
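For the small-file case, even a single rsync invocation gives you
incremental, file-level copies (host and path names are made up):

    rsync -aH --delete --link-dest=../home.yesterday \
          /home/ backuphost:/backup/home.$(date +%F)/
    # unchanged files become hard links into yesterday's copy on the backup host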


 You have a similar issue with file system backups. You have to find the
 last time a particular file was backed up.

 Yes, and it should be MUCH faster!!! I agree.

 *Snrk*.  Neither of these is true.  I don't have to find anything.  I
 pick a point in time between the first backup and the most recent,
 inclusive, and restore whatever I need.  Everything is there.  I don't
 even need to look for tapes; TSM does all that for me.  400GB/hour
 sustained restore throughput is quite good for a network backup system.

400GB/hour? I doubt that number, but OK. That is still close to three hours
for a 1TB disk.


 Remember, ease of restoration is the most important component of a backup
 system.  Yours, apparently, fails to deliver.

Not really, we have been discussing technology. We haven't even been
discussing user facing stuff.

The difference is what you plan to do, I guess. I'm not backing up many
small files.

Think of it this way: a 2TB drive is less than $80 and costs about $0.25 a month
in power. The economics open up a number of possibilities.



Re: [Discuss] Recommendation on Gigabit ISP for small business

2011-10-26 Thread Rich Braun
Hsuan-Yeh Chang hsuan...@gmail.com asked:
 Can anyone recommend an ISP (optical fiber) for 1-10 Gbps
 capability, particularly servicing small businesses in the
 Boston metro area?

Cogent (www.cogentco.com) and XO (www.xo.com) were the two I used most
recently, serving sites in Kendall Square and Waltham.  XO is better, Cogent
is cheaper. Any of the big national providers (search on Internet Traffic
Report) with a presence in the Boston area are able and willing to provide
service, if you're in an urban-core location and willing to put up with weeks
of phone calls from hungry sales-critters.

If you need to serve that much bandwidth, though, it's usually (much) less
expensive to locate the equipment in an established carrier-hotel data center
than at a small business in the Boston metro area.  The local-loop and
installation charges to bring fiber out to a yet-to-be-served location can be
substantial.

  If you can tell the price range, it is very much appreciated.  Thanks
 in advance!

100-megabit service was in the $500-700/mo range circa 2009, probably not a
lot different today.  GigE and above are by custom price quote.

-rich




Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread Dan Ritter
On Tue, Oct 25, 2011 at 11:20:40PM -0400, ma...@mohawksoft.com wrote:
 
  Data point: It takes ~19 hours to restore 7.5TB from enterprise-class
  Tivoli Storage Manager over 1GB Ethernet to a 12x1TB SATA (3Gb/s) RAID 6
  volume.  I had to do it this past spring after the RAID controller on that
  volume went stupid and corrupted the whole thing.
 
 Yes, and it should be MUCH faster!!! I agree.


The bottleneck there is the gigabit ethernet:

1000 Mb/s * 3600 s/h * 1 B/8 b = 450,000 MB/h, or 450 GB/h.
7500 GB / 450 GB/h = 16.7 hours

So the absolute best you could have done ignoring all overhead was 16.7
hours, and it took 19. Not awful.

On the other hand, going to 10GE doesn't move the bottleneck to
the disks -- if you can get 160MB/s per spindle and 10 of the 12
spindles effective, that's:

1600 MB/s * 3600 s/h = 5,760,000 MB/h, or 5760 GB/h, which is still more than
the 4500 GB/h you might get from theoretical no-overhead 10GE.

Assume 10GE with the same 87% efficiency as GE and you get 3.9TB/h,
or about two hours to do your 7.5TB restore. OK, so 10GE is a win for
time, no surprise. However, GE is essentially free (your motherboards
have it, your switching infrastructure is in place) but 10GE involves
$600/port upgrades to the NIC (if you have room) and $1000/port switch
upgrades. Good thing to have next time around, probably not a routine
upgrade for most operations.

-dsr-

-- 
http://tao.merseine.nu/~dsr/eula.html is hereby incorporated by reference.
You can't fight for freedom by taking away rights.


Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread Richard Pieri
On Wed, Oct 26, 2011 at 8:26 AM, ma...@mohawksoft.com wrote:

 400GB/hour? I doubt that number, but ok. It is still close to three hours.


Doubt if you want but I did it last spring.  7.5TB -- ~7500GB -- restored
over 19 hours and change.  That's 400GB/hour average throughput.  Over a
network.  From a tape-based storage system.  That's what an enterprise-class
backup system can do.

I see where you are going with this.  One of the historically canonical
examples is database servers that use raw disk for storage. A
block-incremental backup mechanism would have been very useful 15-20 years
ago when this was more common.  Today, hardly anyone does it this way.
 Today, standard procedures are to either dump the DB to a flat file and
back that up or to perform the freeze/snapshot/thaw cycle and back up the
snapshot.  This may not be the most efficient way to do it.  On the other
hand, if I have so much data to back up that efficiency would be an issue
then I already have sufficient resources in my backup system to make it not
be an issue.

You mention virtual machines.  I can see this if you want to back up the
containers.  The thing is, using an operating system's native tools is
always my preferred choice for making backups.  It may be inefficient at
times but it ensures that I can always recover the backup correctly.  More
generally, it avoids issues with being locked into specific technologies.
 Going back to the example of running a FreeBSD domU on a Linux dom0: with
LVM-based block backups I am locked into using LVM-based block restore for
recovery.  I can't restore this domU onto a FreeBSD or Solaris dom0 or even
onto a real FreeBSD physical machine.  On the other hand, if I use OS tools
to do the backup then I can restore it anywhere that I please.

It is clear to me that we have different philosophies about backup systems.
 In my mind, efficiency is all well and good but it will always take a back
seat to the ease of restoring backups and the ease of creating them.  I've
had to recover from too many disasters and too many stupid mistakes -- some
of them my own -- to see it any other way.


[Discuss] A couple of git questions regarding tags

2011-10-26 Thread Jerry Feldman
I've been tagging my commits, and now I want to use that in my code. The
'git describe' command does this nicely:
In my sandbox I get '1.0.3-1-gf9a4796', but if I push (actually pull)
that into another repository I get '1.0.3'.
I would like to be able to display the version as well as the commit, so
that if someone points out a bug I know exactly which build it came from.
The describe in my sandbox gives me pretty much what I want, and so does
the 1.0.3, but I would prefer it to include the commit number (f9a4796).
Certainly I can parse the log to get the current commit number.
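For what it's worth, git can hand you both pieces directly; for example:

    git describe --tags --long     # e.g. 1.0.3-1-gf9a4796, even when sitting on the tag
    git rev-parse --short HEAD     # just the abbreviated commit id, e.g. f9a4796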

The second question is that I would like a script that will do a git
commit as well as update the tag. I can either write this myself in
bash, tcl or python, or find a script out there that already does it.
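A bash sketch of such a wrapper might be as small as this (the tag-per-commit
scheme is just an example):

    #!/bin/bash
    # usage: committag <version> <commit message...>
    set -e
    version="$1"; shift
    git commit -am "$*"            # commit all tracked changes
    git tag -a "$version" -m "$*"  # annotated tag on the new commit
    #git push && git push --tags   # uncomment to publish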

-- 
Jerry Feldman g...@blu.org
Boston Linux and Unix
PGP key id:3BC1EB90 
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66  C0AF 7CEA 30FC 3BC1 EB90



Re: [Discuss] A couple of git questions regarding tags

2011-10-26 Thread Jerry Feldman
On 10/26/2011 01:23 PM, Jerry Feldman wrote:
 I've been tagging my commits, and now I want to use that in my code. The
 'git describe' command does this nicely:
 In my sandbox I get '1.0.3-1-gf9a4796', but if I push (actually pull)
 that into another repository I get '1.0.3'.
 I would like to be able to display the version as well as the commit, so
 that if someone points out a bug I know exactly which build it came from.
 The describe in my sandbox gives me pretty much what I want, and so does
 the 1.0.3, but I would prefer it to include the commit number (f9a4796).
 Certainly I can parse the log to get the current commit number.

 The second question is that I would like a script that will do a git
 commit as well as update the tag. I can either write this myself in
 bash, tcl or python, or find a script out there that already does it.

One additional related question: suppose the running code includes
uncommitted changes; how easy is it to determine whether everything has
been committed? This is probably more of a rhetorical issue, since the
only place this could happen is in my sandbox, but there have been a
couple of times where I've forgotten to commit.
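For that check, git can answer directly; a sketch:

    git diff-index --quiet HEAD -- || echo "uncommitted changes present"
    git describe --tags --dirty    # appends -dirty when the working tree is modified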

All the above can be accomplished fairly easily in a bash or tcl script.
I'm just looking to get some additional tools that might be hanging around.

-- 
Jerry Feldman g...@blu.org
Boston Linux and Unix
PGP key id:3BC1EB90 
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66  C0AF 7CEA 30FC 3BC1 EB90



Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread Edward Ned Harvey
 From: ma...@mohawksoft.com [mailto:ma...@mohawksoft.com]

 If you can get more than 160MB/s (sustained) on anything other than exotic
 hardware, I'd be surprised. 1Gbit/sec per disk sustained is currently not
 possible with COTS hardware that is available.
 
  Transfer rate is not sustained, and peak is not sustained. Yes, if
  you can manage to read/write to disk cache, you can get cool performance,
  but if you are doing backups, you will blow out the cache quite quickly.

Go measure it before you say any more, because I've spent a lot of time in
the last 4 years benchmarking disks.  I can say the typical sequential
throughput, read or write, for nearly all disks (7.2krpm sata up to 15krpm
sas) is 1.0 Gbit/sec.  Sustained sequential read/write.  For, let's say, the
entire disk, or at least tens of GB.

Even laptops (7.2krpm sata dell) are able to sustain this speed.
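Easy enough to check on any given box, for example (the device name is just an
example; iflag=direct keeps the page cache out of the read path, and the
drop_caches step needs root):

    dd if=/dev/sda of=/dev/null bs=1M count=10240 iflag=direct  # ~10GB sequential read
    sync; echo 3 > /proc/sys/vm/drop_caches                     # flush caches before a re-test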



Re: [Discuss] lvm snapshot cloning

2011-10-26 Thread Dan Ritter
On Wed, Oct 26, 2011 at 03:18:06PM -0400, Edward Ned Harvey wrote:
  From: ma...@mohawksoft.com [mailto:ma...@mohawksoft.com]
 
  If you can get more than 160MB/s (sustained) on anything other than exotic
  hardware, I'd be surprised. 1Gbit/sec per disk sustained is currently not
  possible with COTS hardware that is available.
  
   Transfer rate is not sustained, and peak is not sustained. Yes, if
   you can manage to read/write to disk cache, you can get cool performance,
   but if you are doing backups, you will blow out the cache quite quickly.
 
  Go measure it before you say any more, because I've spent a lot of time in
  the last 4 years benchmarking disks.  I can say the typical sequential
  throughput, read or write, for nearly all disks (7.2krpm sata up to 15krpm
  sas) is 1.0 Gbit/sec.  Sustained sequential read/write.  For, let's say, the
  entire disk, or at least tens of GB.
 
 Even laptops (7.2krpm sata dell) are able to sustain this speed.

Erm. 

1 Gb/s * 1024 Mb/Gb * 1 B/8 b = 128 MB/s

Anything which can do 160 MB/s is clearly capable of doing 128 MB/s.

You two are arguing in different directions.

-dsr-


Re: [Discuss] Verizon ADSL and port blocking

2011-10-26 Thread Bill Horne
On Wed, 2011-10-26 at 11:30 -0400, Gregory Boyce wrote:

 Turn off any local firewalling, and go to http://nmap-online.com/
 
 After a quick scan, you will likely see a mix of open, closed and
 filtered ports.  Anything open or closed should be fine to use.
 Anything filtered is not.

I got various errors when I tried the site: Scan ran too long,
Forbidden, etc. If there are other ways to scan my IP, please tell me
how. 

TIA.

Bill




Re: [Discuss] Verizon ADSL and port blocking

2011-10-26 Thread Tom Metro
Bill Horne wrote:
 If there are other ways to scan my IP, please tell me how.

ShieldsUP
https://www.grc.com/x/ne.dll?bh0bkyd2

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
Enterprise solutions through open source.
Professional Profile: http://tmetro.venturelogic.com/