Re: Scaling issue

2010-05-20 Thread Henrik Sarvell
I've summed up the result of this thread here: http://picolisp.com/5000/-2-I.html with some explanations. /Henrik On Fri, May 14, 2010 at 8:59 AM, Henrik Sarvell hsarv...@gmail.com wrote: OK since I can't rely on sorting by date anyway let's forget that idea. Yes since it seemed I had to

Re: Scaling issue

2010-05-14 Thread Alexander Burger
On Thu, May 13, 2010 at 09:12:06PM +0200, Henrik Sarvell wrote: One thing first though, since articles are indexed when they're parsed and PL isn't doing any kind of sorting automatically on insert then they should be sorted by date automatically with the latest articles at the end of the

Re: Scaling issue

2010-05-14 Thread Henrik Sarvell
OK since I can't rely on sorting by date anyway let's forget that idea. Yes since it seemed I had to involve dates anyway I simply chose a date far back enough in time that if someone is looking for something they might as well use Google. Anyway the above is scanning 19 remotes containing

Re: Scaling issue

2010-05-13 Thread Henrik Sarvell
Everything is running smoothly now, I intend to make a write up on the wiki this weekend maybe on this. One thing first though, since articles are indexed when they're parsed and PL isn't doing any kind of sorting automatically on insert then they should be sorted by date automatically with the

Re: Scaling issue

2010-05-13 Thread Henrik Sarvell
See my prior post for context. I've been testing a few different approaches and this is the fastest so far: (de getArticles (W) (let Goal (goal (quote @Word W (select (@Wcs) ((word +WordCount @Word)) (same @Word @Wcs

Re: Scaling issue

2010-05-11 Thread Alexander Burger
On Mon, May 10, 2010 at 11:50:52PM +0200, Henrik Sarvell wrote: My code simply stops executing (as if waiting for the next entry but it never gets it) when I run out of entries to fetch, really strange and a traceAll confirms this, the last output is a call to rd1. What happens on the remote

Re: Scaling issue

2010-05-10 Thread Henrik Sarvell
Ah I see, so the issue is on the remote side then, what did your code look like there, did you use (prove)? On Mon, May 10, 2010 at 7:22 AM, Alexander Burger a...@software-lab.de wrote: Hi Henrik, One final question, how did you define the rd1 mechanism? In the mentioned case, I used the

Re: Scaling issue

2010-05-10 Thread Alexander Burger
On Mon, May 10, 2010 at 09:04:48AM +0200, Henrik Sarvell wrote: Ah I see, so the issue is on the remote side then, what did your code look like there, did you use (prove)? There were several scenarios. In cases where only a few hits are to be expected, I used 'collect': (for Obj (collect

Re: Scaling issue

2010-05-09 Thread Henrik Sarvell
One final question, how did you define the rd1 mechanism? Simply doing: (dm rd1 (Sock) (in Sock (rd))) will read the whole result, not just the first result, won't it? I'm a little bit confused since it says in the reference that rd will read the first item from the current input channel

Re: Scaling issue

2010-05-09 Thread Alexander Burger
Hi Henrik, One final question, how did you define the rd1 mechanism? In the mentioned case, I used the following method in the +Agent class (dm rd1 (Sock) (when (assoc Sock (: socks)) (rot (: socks) (index @ (: socks))) (ext (: ext) (or (in

Re: Scaling issue

2010-04-25 Thread Henrik Sarvell
So I gather the *Ext mapping is absolutely necessary regardless of whether remote or ext is used. I took a look at the *Ext section again, could I use this maybe: (setq *Ext # Define extension functions (mapcar '((@Host @Port @Ext) (let Sock NIL (cons @Ext

Re: Scaling issue

2010-04-25 Thread Alexander Burger
On Sun, Apr 25, 2010 at 03:17:55PM +0200, Henrik Sarvell wrote: So I gather the *Ext mapping is absolutely necessary regardless of whether remote or ext is used. Yes. Only in case you do not intend to communicate whole objects between the remote and local application, but only scalar data like

Re: Scaling issue

2010-04-25 Thread Henrik Sarvell
Ah so the key is to have the connections in a list, I should have understood that. Thanks for the help, I'll try it out! On Sun, Apr 25, 2010 at 4:51 PM, Alexander Burger a...@software-lab.de wrote: On Sun, Apr 25, 2010 at 03:17:55PM +0200, Henrik Sarvell wrote: So I gather the *Ext mapping

Re: Scaling issue

2010-04-21 Thread Henrik Sarvell
One small question before I start working. Now all remote databases start sending their results, ordered by date. They are actually busy only until the TCP queue fills up, or until the connection is closed. If the queue is filled up, they will block so that it is advisable that they are all

Re: Scaling issue

2010-04-21 Thread Alexander Burger
On Wed, Apr 21, 2010 at 06:35:30PM +0200, Henrik Sarvell wrote: At first my remotes will be on the same machine so yes they could all be forked from the main process. That's all right. On the other hand, the remote processes might be different _programs_ (i.e. starting from a separate 'main',

Re: Scaling issue

2010-04-20 Thread Henrik Sarvell
I've been reading up a bit on the remote stuff, I haven't made the articles distributed yet but let's assume I have, with 10 000 articles per remote. Let's also assume that I have remade the word indexes to now work with real +Ref +Links on each remote that links words and articles (not simply

Re: Scaling issue

2010-04-15 Thread Henrik Sarvell
On the other hand, if I'm to follow my own thinking to its logical conclusion I should make the articles distributed too, with blobs and all. On Wed, Apr 14, 2010 at 9:51 PM, Henrik Sarvell hsarv...@gmail.com wrote: I don't know Alex, remember that we disconnected stuff, I'll paste the remote

Re: Scaling issue

2010-04-15 Thread Henrik Sarvell
To simply be able to pass along simple commands like collect and db ie. the *Ext stuff was overkill, which works just fine except in this special case when there are thousands of articles to a feed. I'm planning to distribute the whole DB except users and what feeds they subscribe to. Everything

Re: Scaling issue

2010-04-14 Thread Henrik Sarvell
I don't know Alex, remember that we disconnected stuff, I'll paste the remote E/R again (all of it, there is nothing else on the remotes): (class +WordCount +Entity) (rel article (+Ref +Number)) (rel word (+Aux +Ref +Number) (article)) (rel count (+Number)) The numbers here can then

Re: Scaling issue

2010-04-11 Thread Alexander Burger
Hi Henrik, (class +ArFeLink +Entity) (rel article (+Aux +Ref +Link) (feed) NIL (+Article)) (rel feed (+Ref +Link) NIL (+Feed)) (collect 'feed '+ArFeLink Obj Obj 'article) takes forever (2 mins) I need it to take something like maximum 2 seconds... Can this be fixed by adding some

Re: Scaling issue

2010-04-11 Thread Henrik Sarvell
I see, I should've known about that one (I'm using it to get similar articles already). What's additionally needed is: 1.) Calculating total count somehow without retrieving all articles. 2.) Somehow sorting by date so I get say the 25 first articles. If those two can also be achieved in a

Re: Scaling issue

2010-04-11 Thread Alexander Burger
On Sun, Apr 11, 2010 at 12:25:42PM +0200, Henrik Sarvell wrote: What's additionally needed is: 1.) Calculating total count somehow without retrieving all articles. If it is simply the count of all articles in the DB, you can get it directly from a '+Key' or '+Ref' index. I don't quite

Re: Scaling issue

2010-04-11 Thread Henrik Sarvell
Thanks Alex, I will go for the reversed range and check out select/3. I'm already using collect with dates extensively but in this case it wouldn't work as I need the 25 newest regardless of exactly when they were published. /Henrik On Sun, Apr 11, 2010 at 1:27 PM, Alexander Burger

Re: Scaling issue

2010-04-11 Thread Alexander Burger
On Sun, Apr 11, 2010 at 02:19:23PM +0200, Henrik Sarvell wrote: Thanks Alex, I will go for the reversed range and check out select/3. Let me mention that since picoLisp-3.0.1 we have a separate documentation of 'select/3', in doc/select.html.