Re: which datastructure for fast sorted insert?

2008-05-25 Thread I V
On Sun, 25 May 2008 18:42:06 -0700, notnorwegian wrote:
> def scrapeSites(startAddress):
>     site = startAddress
>     sites = set()
>     iterator = iter(sites)
>     pos = 0
>     while pos < 10:#len(sites):
>         newsites = scrapeSite(site)
>         joinSets(sites, newsites)

You change t

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Gabriel Genellina
On Sun, 25 May 2008 22:42:06 -0300, <[EMAIL PROTECTED]> wrote:
> def joinSets(set1, set2):
>     for i in set2:
>         set1.add(i)
>     return set1

Use the | operator, or |=

> Traceback (most recent call last):
>   File "C:/Python25/Progs/WebCrawler/spider2.py", line 47, in <module>
>     x = scr
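
For illustration, here is a minimal sketch of the | and |= suggestion. The URLs are made-up placeholders, not data from the thread. The in-place form does the same job as the joinSets loop quoted above.

```python
# Hypothetical example data; these URLs are placeholders, not from the thread.
sites = set(["http://a.example", "http://b.example"])
newsites = set(["http://b.example", "http://c.example"])

# In-place union: adds every element of newsites to sites,
# which is what the quoted joinSets loop does by hand.
sites |= newsites

# The | operator builds a new set and leaves both operands unchanged.
combined = sites | set(["http://d.example"])

print(sorted(sites))
print(sorted(combined))
```

Note that |= changes the size of the set, so it is still not safe to use while an iterator over that same set is live.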

Re: which datastructure for fast sorted insert?

2008-05-25 Thread sturlamolden
On May 25, 8:02 pm, Rodrigo Lazo <[EMAIL PROTECTED]> wrote:
> what about heapq for sorting?

A heap is the data structure to use for 'fast (nearly) sorted inserts'. But heapq does not support (as far as I know) deletion of duplicates. But a custom heap class could do that of course.
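
One way such a custom heap class might look, as a rough sketch only: it pairs heapq with a set and rejects duplicates at insertion time rather than deleting them afterwards. The class name UniqueHeap and the sample hostnames are hypothetical.

```python
import heapq

class UniqueHeap(object):
    """Min-heap that ignores items it has already seen (hypothetical helper)."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, item):
        # Push only items not seen before; the set gives an
        # average O(1) membership test, heappush is O(log n).
        if item not in self._seen:
            self._seen.add(item)
            heapq.heappush(self._heap, item)

    def pop(self):
        # Return the smallest remaining item. It stays in _seen,
        # so pushing it again later is still ignored.
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)

h = UniqueHeap()
for url in ["b.example", "a.example", "b.example", "c.example"]:
    h.push(url)

# Popping repeatedly yields the items in sorted order.
in_order = [h.pop() for _ in range(len(h))]
print(in_order)
```

The heap itself is only partially ordered; a fully sorted sequence comes from popping, not from reading the underlying list.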

Re: which datastructure for fast sorted insert?

2008-05-25 Thread notnorwegian
Traceback (most recent call last):
  File "C:/Python25/Progs/WebCrawler/spider2.py", line 47, in <module>
    x = scrapeSites("http://www.yahoo.com")
  File "C:/Python25/Progs/WebCrawler/spider2.py", line 31, in scrapeSites
    site = iterator.next()
RuntimeError: Set changed size during iteration

def j
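
The RuntimeError above is raised because the set grows while an iterator over it is still in use. One possible way around it, sketched under assumptions rather than taken from the poster's code: keep the sites still to be visited in a separate queue and use the set only for membership tests. Here scrape_site and the FAKE_LINKS table are hypothetical stand-ins for real page fetching.

```python
from collections import deque

# Hypothetical link graph standing in for real HTTP fetches.
FAKE_LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["a.example", "d.example"],
    "c.example": [],
    "d.example": ["a.example"],
}

def scrape_site(site):
    """Stand-in for the poster's scrapeSite: return the links found on a page."""
    return FAKE_LINKS.get(site, [])

def scrape_sites(start, limit=10):
    seen = set([start])        # every site discovered so far
    pending = deque([start])   # discovered but not yet visited
    visited = 0
    while pending and visited < limit:
        site = pending.popleft()
        visited += 1
        for link in scrape_site(site):
            # The set is never iterated here, so adding to it is safe.
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return seen

result = scrape_sites("a.example")
print(sorted(result))
```

The limit parameter plays the role of the `pos < 10` guard in the original loop: it caps the number of pages visited, not the number of sites discovered.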

Re: which datastructure for fast sorted insert?

2008-05-25 Thread notnorwegian
On 26 May, 03:04, [EMAIL PROTECTED] wrote:
> On 26 May, 01:30, I V <[EMAIL PROTECTED]> wrote:
> > On Sun, 25 May 2008 15:49:16 -0700, notnorwegian wrote:
> > > i meant like set[pos], not iterate but access a specific position in the
> > > set.
> >
> > If you need to access arbitrary elements, u

Re: which datastructure for fast sorted insert?

2008-05-25 Thread notnorwegian
On 26 May, 01:30, I V <[EMAIL PROTECTED]> wrote:
> On Sun, 25 May 2008 15:49:16 -0700, notnorwegian wrote:
> > i meant like set[pos], not iterate but access a specific position in the
> > set.
>
> If you need to access arbitrary elements, use a list instead of a set
> (but you'll get slower inserts

Re: which datastructure for fast sorted insert?

2008-05-25 Thread miller . paul . w
On May 25, 2:37 am, [EMAIL PROTECTED] wrote:
> im writing a webcrawler.
> after visiting a new site i want to store it in alphabetical order.
>
> so obv i want fast insert. i want to delete duplicates too.
>
> which datastructure is best for this?

I think you ought to re-examine your requirements.

Re: which datastructure for fast sorted insert?

2008-05-25 Thread I V
On Sun, 25 May 2008 15:49:16 -0700, notnorwegian wrote:
> i meant like set[pos], not iterate but access a specific position in the
> set.

If you need to access arbitrary elements, use a list instead of a set (but you'll get slower inserts). OTOH, if you just need to be able to get the next item

Re: which datastructure for fast sorted insert?

2008-05-25 Thread notnorwegian
On May 25, 9:32 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Sun, 25 May 2008 00:10:45 -0700, notnorwegian wrote:
> > sets dont seem to be so good because there is no way to iterate them.
>
> Err:
>
> In [82]: for x in set(['a', 'b', 'c']):
>    :     print x
>    :
> a
> c

Re: which datastructure for fast sorted insert?

2008-05-25 Thread I V
On Sun, 25 May 2008 13:05:31 -0300, Gabriel Genellina wrote:
> Use a list, and the bisect module to keep it sorted:

That's worth doing if you need the data to be sorted after each insert. If the OP just needs the data to be sorted at the end, using a data structure with fast inserts (like a set)
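
A small sketch of the "collect in a set, sort once at the end" approach described above. The list of found hostnames is invented for illustration.

```python
# Hypothetical crawl results arriving in arbitrary order, with a repeat.
found = ["c.example", "a.example", "b.example", "a.example"]

sites = set()
for url in found:
    sites.add(url)        # average O(1) per insert; duplicates are dropped

# A single O(n log n) sort once collection is finished.
ordered = sorted(sites)
print(ordered)
```

This only pays off if the sorted view is needed rarely; calling sorted() after every insert would cost more than keeping a sorted list up to date.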

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Rodrigo Lazo
Stefan Behnel <[EMAIL PROTECTED]> writes:
> [EMAIL PROTECTED] wrote:
>> im writing a webcrawler.
>> after visiting a new site i want to store it in alphabetical order.
>>
>> so obv i want fast insert. i want to delete duplicates too.
>>
>> which datastructure is best for this?
>
> Keep the data

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Stefan Behnel
[EMAIL PROTECTED] wrote:
> im writing a webcrawler.
> after visiting a new site i want to store it in alphabetical order.
>
> so obv i want fast insert. i want to delete duplicates too.
>
> which datastructure is best for this?

Keep the data redundantly in two data structures. Use collections.de

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Terry Reedy
"Rares Vernica" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
| >>> l=list(s)
| >>> l.sort()

This can be condensed to l = sorted(s)

| >>> l
| ['a', 'b', 'c']

--
http://mail.python.org/mailman/listinfo/python-list

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Gabriel Genellina
On Sun, 25 May 2008 03:37:00 -0300, <[EMAIL PROTECTED]> wrote:
> im writing a webcrawler.
> after visiting a new site i want to store it in alphabetical order.
>
> so obv i want fast insert. i want to delete duplicates too.
>
> which datastructure is best for this?

Use a list, and the bisect m
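
The rest of this message is cut off in the archive, so the following is not the poster's code. It is only a sketch of one way a list plus the standard bisect module can stay sorted and free of duplicates. The helper name insert_unique and the sample hostnames are hypothetical.

```python
import bisect

def insert_unique(sorted_list, item):
    """Insert item into sorted_list, keeping it sorted.

    Returns False without changing the list if the item is already present.
    Hypothetical helper name, for illustration only.
    """
    i = bisect.bisect_left(sorted_list, item)     # O(log n) binary search
    if i < len(sorted_list) and sorted_list[i] == item:
        return False                              # duplicate, skip it
    sorted_list.insert(i, item)                   # O(n): later items shift right
    return True

sites = []
for url in ["c.example", "a.example", "b.example", "a.example"]:
    insert_unique(sites, url)

print(sites)
```

The search is logarithmic but the insert itself is linear in the list length, which is the "slower inserts" trade-off mentioned elsewhere in the thread.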

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Benjamin Kaplan
On Sun, May 25, 2008 at 3:10 AM, <[EMAIL PROTECTED]> wrote: > > > >>> l=list(s) > > >>> l.sort() > > >>> l > > > > ['a', 'b', 'c'] > > > > hth, > > Rares > > sets dont seem to be so good because there is no way to iterate them. > > s.pop() remove and return an arbitrary element fro

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Marc 'BlackJack' Rintsch
On Sun, 25 May 2008 00:10:45 -0700, notnorwegian wrote:
> sets dont seem to be so good because there is no way to iterate them.

Err:

In [82]: for x in set(['a', 'b', 'c']):
   :     print x
   :
a
c
b

Ciao,
Marc 'BlackJack' Rintsch

Re: which datastructure for fast sorted insert?

2008-05-25 Thread notnorwegian
On 25 May, 08:56, Rares Vernica <[EMAIL PROTECTED]> wrote:
> use a set to store them:
>
> >>> s=set()
> >>> s.add('a')
> >>> s.add('b')
> >>> s
> set(['a', 'b'])
> >>> s.add('a')
> >>> s
> set(['a', 'b'])
> >>> s.add('c')
> >>> s
> set(['a', 'c', 'b'])
>
> it does remove duplicates, but is it

Re: which datastructure for fast sorted insert?

2008-05-25 Thread Rares Vernica
use a set to store them:

>>> s=set()
>>> s.add('a')
>>> s.add('b')
>>> s
set(['a', 'b'])
>>> s.add('a')
>>> s
set(['a', 'b'])
>>> s.add('c')
>>> s
set(['a', 'c', 'b'])
>>>

it does remove duplicates, but it is not ordered. to order it you can use:

>>> l=list(s)
>>> l.sort()
>>> l
['a', 'b', 'c']

which datastructure for fast sorted insert?

2008-05-24 Thread notnorwegian
im writing a webcrawler.
after visiting a new site i want to store it in alphabetical order.

so obv i want fast insert. i want to delete duplicates too.

which datastructure is best for this?