Re: [Zope-dev] Request for comments: Directory storage
>> It would be great if you could do it, but beware that you will be benchmarking a lot of overhead if you only plan to measure storage performance. Why not use ZODB directly?
>
> If I talk HTTP, it measures things fully - Python's interpreter lock will mean a storage system written in Python will benchmark better without having to compete with ZServer, and vice versa for storage systems with non-Pythonic bits.

Yes, you are right.

>> What filesystem does that use?
>
> No idea :-) Something log based that is very fast and handles huge directories happily. It also appears that another member of this list has an EMC Symmetrix box to test on, which I believe is the next (and highest) level up from a NetApp.

Mmmm... I heard that Network Appliance hired a couple of the SGI engineers that designed XFS?

> I've attached a prerelease alpha of zouch.py for giggles. Not even a command line yet, so you will need to edit some code at the bottom. The current settings generate about 360 directories and about 36000 files, and proceed to make about 18 reads. This bloated my test ZODB to just over 200MB and took about 2.6 hours attacking my development Zope server from another host on my LAN.

Cool :) Thanks for writing this, it will be very useful for benchmarking.

-Petru

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )
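The HTTP versus direct-ZODB question above can be explored with a harness that takes the fetch step as a parameter, so the same timing loop can wrap either path. What follows is only a rough sketch in modern Python 3, not zouch.py itself: `make_paths`, `http_fetch` and `run_benchmark` are hypothetical names, and the path naming scheme is an assumption made for illustration.

```python
import time
from urllib.request import urlopen


def make_paths(n_dirs, files_per_dir):
    """Algorithmically generate the object paths the benchmark will touch."""
    return [f"dir{d:04d}/doc{f:04d}"
            for d in range(n_dirs)
            for f in range(files_per_dir)]


def http_fetch(base_url):
    """Return a fetch callable that reads one path over HTTP (full-stack cost)."""
    def fetch(path):
        with urlopen(base_url.rstrip("/") + "/" + path) as resp:
            return resp.read()
    return fetch


def run_benchmark(fetch, paths):
    """Call fetch(path) for every path; return (elapsed_seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        total += len(fetch(path))
    return time.perf_counter() - start, total
```

With 360 directories of 100 files each, `make_paths(360, 100)` yields the 36000 paths mentioned above. Passing `http_fetch("http://localhost:8080/bench")` (a made-up URL) measures the whole stack including ZServer, while passing a callable that reads straight from a storage would isolate the storage cost.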
Re: [Zope-dev] Request for comments: Directory storage
On Fri, 9 Jun 2000, Petru Paler wrote:

>> I'd love some sort of benchmarking tool for this (and possibly other Storages). I guess the best way would be a Python script that uses urllib. Something that would algorithmically pump up the DB to 1GB in size and retrieve the URLs. Any volunteers or am I doing it in my copious spare time (tm)?
>
> It would be great if you could do it, but beware that you will be benchmarking a lot of overhead if you only plan to measure storage performance. Why not use ZODB directly?

If I talk HTTP, it measures things fully - Python's interpreter lock will mean a storage system written in Python will benchmark better without having to compete with ZServer, and vice versa for storage systems with non-Pythonic bits.

>> I've got a nice NetApp here to run some tests on.
>
> What filesystem does that use?

No idea :-) Something log based that is very fast and handles huge directories happily. It also appears that another member of this list has an EMC Symmetrix box to test on, which I believe is the next (and highest) level up from a NetApp.

I've attached a prerelease alpha of zouch.py for giggles. Not even a command line yet, so you will need to edit some code at the bottom. The current settings generate about 360 directories and about 36000 files, and proceed to make about 18 reads. This bloated my test ZODB to just over 200MB and took about 2.6 hours attacking my development Zope server from another host on my LAN.
Todo:

  - tidy and vet ugly code
  - command line interface
  - dynamic option (do more intensive DTML stuff - currently just standard_html_header/standard_html_footer)
  - catalog option (since DTML Documents aren't catalog aware, will need to make two calls to make a new document)
  - upload larger documents and some binaries (200MB isn't great for benchmarking when you might have a gig of RAM doing caching for you)
  - standard test suite
  - better reporting
  - spinning doohickey so we know it hasn't hung without having to look at log files

-- 
Stuart Bishop                          Work: [EMAIL PROTECTED]
Senior Systems Alchemist               Play: [EMAIL PROTECTED]
Computer Science, RMIT University

#!/bin/env python
'''
$Id: zouch.py,v 1.3 2000/06/12 04:23:01 zen Exp $

Zouch - the Zope torture tester
'''

import whrandom
import sha
import threading
import ftplib
import httplib
from string import split,join,replace
from time import time,strftime,localtime,sleep
from StringIO import StringIO
from Queue import Queue
from threading import Thread,RLock
from urllib import urlencode
from urlparse import urlparse
from base64 import encodestring

retries = 10
retrysleep = 1

def debug(msg):
    print 'D: %s - %s' % (threading.currentThread().getName(),msg)

# Fatal exceptions will not be caught
class FatalException(Exception): pass
class UnsupportedProtocol(FatalException): pass

class FolderLock:
    # One lock per folder path; 'sync' guards the dictionary of locks
    def __init__(self):
        self.locks = {}
        self.sync = RLock()

    def lock(self,dirs):
        self._lock(self._mypath(dirs))
        self._lock(self._parentpath(dirs))

    def unlock(self,dirs):
        self._unlock(self._parentpath(dirs))
        self._unlock(self._mypath(dirs))

    def _parentpath(self,dirs):
        if len(dirs) == 1:
            return 'root'
        else:
            return join(dirs[:-1],'/')

    def _mypath(self,dirs):
        return join(dirs,'/')

    def _lock(self,d):
        locks = self.locks
        sync = self.sync
        while 1:
            # Set before the try so 'finally' never sees an unbound name
            acq = 0
            try:
                sync.acquire()
                acq = 1
                if locks.has_key(d):
                    # Held by someone else: wait for release, then retry
                    l = locks[d]
                    sync.release()
                    acq = 0
                    l.acquire()
                    l.release()
                else:
                    l = RLock()
                    l.acquire()
                    locks[d] = l
                    break
            finally:
                if acq: sync.release()

    def _unlock(self,d):
        locks = self.locks
        sync = self.sync
        sync.acquire()
        try:
            l = locks[d]
            del locks[d]
            l.release()
        finally:
            sync.release()

folderlock = FolderLock()

class HTTPMaker:
    'Baseclass for HTTP Maker classes'
    def __init__(self,queue,url,username,password):
        purl = urlparse(url)
        # The netloc may omit the port, so do not unpack blindly
        hostport = split(purl[1],':',1)
        host = hostport[0]
        path = purl[2]
        if len(hostport) > 1 and hostport[1]:
            port = int(hostport[1])
        else:
            port = 80
        if path[-1] == '/':
            self.path = path
        else:
            self.path = path + '/'
        self.queue = queue
        self.ops = 0
        if username is None:
            self.auth = None
        else:
            if password is None: password = ''
            self.auth = 'Basic %s' % \
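For readers following the FolderLock class in zouch.py, here is a rough sketch of the same idea (lock a folder together with its parent) in modern Python 3. It is not the author's implementation: it swaps the per-path RLock dictionary for a single condition variable, which lets both paths be claimed in one step, and the names `locked` and `_held` are hypothetical.

```python
import threading
from contextlib import contextmanager


class FolderLock:
    """Lock a folder and its parent together (sketch, not reentrant)."""

    def __init__(self):
        self._held = set()                  # paths currently locked
        self._cond = threading.Condition()  # guards _held

    @staticmethod
    def _paths(dirs):
        """Return (own path, parent path); top-level folders share 'root'."""
        me = "/".join(dirs)
        parent = "/".join(dirs[:-1]) if len(dirs) > 1 else "root"
        return me, parent

    @contextmanager
    def locked(self, dirs):
        wanted = set(self._paths(dirs))
        with self._cond:
            # Wait until neither path is held, then claim both at once,
            # so two threads can never each hold half of what they need.
            while self._held & wanted:
                self._cond.wait()
            self._held |= wanted
        try:
            yield
        finally:
            with self._cond:
                self._held -= wanted
                self._cond.notify_all()
```

A worker would wrap each folder operation as `with folderlock.locked(['a', 'b']): ...`. A thread that tries to take the same path again while already holding it will block forever, so nested use on one path has to be avoided.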
Re: [Zope-dev] Request for comments: Directory storage
Petru Paler:

This is the embodiment of my MultiFileStorage thingy on Jim's ZODB Wiki. I dropped it (never picked it up) when Mountable Storage was announced. I'll create a ReiserFS partition some time this week and try it out.

Excellent!

Hello all,

You probably saw my post yesterday with the first alpha of ReiserStorage. One of the questions that people tend to ask about it is whether they can use it without reiserfs. There are two problems with not using reiserfs:

1. ReiserStorage (now renamed to DirectoryStorage) stores each object in a separate file and *all* the files in a single directory. This was done in order to let the filesystem do what it was meant to do: store and retrieve files quickly. While reiserfs is *extremely* good at this (it uses a btree to store directory entries), most other filesystems do linear searches when finding a file, so performance is very bad when you have many files in a single directory. This problem can be solved by splitting files into multiple directories when not using reiserfs. This would add a little overhead but it is tolerable.

2. Waste of space. Typical block-allocation filesystems like ext2 and FAT will waste a lot of space in the usage pattern of DirectoryStorage. ReiserFS packs small files together in the btree, so it solves the problem, but I have no idea how this could be fixed easily on the other filesystems.

Comments? Suggestions?

PS: a new DirectoryStorage release will be done today, with bugfixes and new features.

-Petru

Jason Spisak
CIO
HireTechs.com
6151 West Century Boulevard
Suite 900
Los Angeles, CA 90045
P. 310.665.3444
F. 310.665.3544

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
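The workaround from point 1, splitting files into multiple directories on filesystems with linear directory lookups, might look roughly like the sketch below. This is an illustration in modern Python 3 and not DirectoryStorage's actual layout: the function names, the use of SHA-1 and the two-level, two-hex-digit fan-out are all assumptions.

```python
import hashlib
import os


def object_path(root, oid, levels=2, width=2):
    """Map an object id (bytes) to a path under nested hash-prefix directories.

    With levels=2 and width=2 there are at most 256 * 256 leaf
    directories, so no single directory has to hold every object file.
    """
    digest = hashlib.sha1(oid).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, oid.hex())


def store(root, oid, data):
    """Write one object to its own file, creating directories on demand."""
    path = object_path(root, oid)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path


def load(root, oid):
    """Read one object back from its file."""
    with open(object_path(root, oid), "rb") as f:
        return f.read()
```

Hashing the id spreads objects evenly across directories even when ids are assigned sequentially. The price is the small overhead mentioned above: a couple of extra directory lookups per access, plus a directory-creation check on each write.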