Re: soliciting user stories of picolisp
Tomas, you have to read http://picolisp.com/5000/-2-I.html if you want to
understand completely how it works. And the problem is of course that it's
slow (regardless of where or what), and I don't really have the time to
fix it :-)

On Wed, Jul 21, 2010 at 9:40 AM, Alexander Burger wrote:
> Hi Tomas,
>
>> > Such numbers are very variable, and difficult to predict.
>>
>> I'm not sure what you mean. How long does a simple grep over the
>> article blob files take? That should serve as a rough indicator about
>> worst case behaviour.
>
> I'm not talking about the timings of 'grep', but of the database.
>
> 'grep' is also subject to cache effects, but not as much as the PicoLisp
> database, where each process caches all objects once they have been
> accessed. The whole query context is also cached, and related searches
> continue in the same context.
>
> The timings are also difficult to predict because they depend very much
> on the distribution of keys within the indexes, and which keys are
> queried from each index in which combination. For example, if you ask
> for a key combination that contains one or several keys that occur
> _seldom_ in the db, the matching results are found almost immediately.
> On the opposite end, searching for a combination of _common_ keys may
> take relatively long to find the exact hits.
>
> Cheers,
> - Alex
Re: soliciting user stories of picolisp
Hi Tomas,

>> Such numbers are very variable, and difficult to predict.
>
> I'm not sure what you mean. How long does a simple grep over the
> article blob files take? That should serve as a rough indicator about
> worst case behaviour.

I'm not talking about the timings of 'grep', but of the database.

'grep' is also subject to cache effects, but not as much as the PicoLisp
database, where each process caches all objects once they have been
accessed. The whole query context is also cached, and related searches
continue in the same context.

The timings are also difficult to predict because they depend very much
on the distribution of keys within the indexes, and which keys are
queried from each index in which combination. For example, if you ask
for a key combination that contains one or several keys that occur
_seldom_ in the db, the matching results are found almost immediately.
On the opposite end, searching for a combination of _common_ keys may
take relatively long to find the exact hits.

Cheers,
- Alex
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
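Alex's point about rare vs. common keys can be made concrete outside PicoLisp. The following is a hypothetical Python sketch (not PicoLisp internals) of intersecting index posting lists: starting from the shortest list keeps the candidate set small, which is why a combination containing one rare key answers almost immediately, while all-common keys force a large scan.

```python
# Illustration only: intersect posting lists, rarest key first.
def intersect(postings):
    postings = sorted(postings, key=len)   # shortest (rarest) list first
    result = set(postings[0])
    for p in postings[1:]:
        result &= set(p)                   # candidates can only shrink
    return result

rare = [3]                                 # a key occurring seldom in the db
common = list(range(10000))                # a key occurring everywhere
print(intersect([common, rare]))  # -> {3}
```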
Re: soliciting user stories of picolisp
Hi Henrik,

> 1.) This is what each "remote" looks like by way of E/R:
>
> (class +WordCount +Entity)
> (rel article   (+Ref +Number))
> (rel word      (+Aux +Ref +Number) (article))
> (rel count     (+Number))
> (rel picoStamp (+Ref +Number))
>
> (dbs
>    (4 +WordCount)
>    (3 (+WordCount word article picoStamp)))

I can't see how this works. The search index I implemented was like this:

("picolisp" (5 . "file1") (4 . "file2") ...)
("google" (3 . "file1") (2 . "file3") ...)
...

In your schema, I don't see how words are represented.

> The bottleneck lies somewhere else than the actual lookup,

So what is the problem then? ;-)

> search since it returns the maximum 50 where picolisp only returns 8.

Those are very long times considering there are so few results.

> So the bottleneck is not the search itself but rather badly optimized
> code that goes to work on the results later.

Hard to say from what I know.

>> a way of extracting and specifying the interesting content from the
>> harvested feeds and links their articles point to
>
> Well, the links you should be able to see in a per feed/category link map
> (I noticed it was broken; hopefully it will work from now on). As for
> specifying content through an XPath, what is it that you hope to gain by
> that? Give me a specific example please.

Most feeds don't contain the actual text which I'm interested in, but
only a link. That means I have to click around too much. For example,
the BBC News http://www.bbc.co.uk/news/ feed
http://feeds.bbci.co.uk/news/rss.xml gets me only a short line and a
link. I would like to see the linked content directly without clicking,
and I also don't want to see the whole page with all that redundant
junk, but only the text of the article. That text is inside a particular
element, so I could specify the XPath
/html/body/div[2]/div[2]/div[2]/div/div[2]/div[2]/div[2]/div and the
feed reader would automatically display just the portion of the page I
am interested in.
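To make the XPath idea concrete: a Python sketch (not part of VizReader; the markup and the path are made up) that pulls just the article text out of a page, using the limited XPath subset in the standard library.

```python
import xml.etree.ElementTree as ET

# A toy page standing in for a fetched article (hypothetical markup).
page = """<html><body>
<div>navigation junk</div>
<div><div>sidebar</div><div>The actual article text.</div></div>
</body></html>"""

root = ET.fromstring(page)
# ElementTree supports a subset of XPath; 'div[2]' selects the 2nd div child.
node = root.find("body/div[2]/div[2]")
print(node.text)  # -> The actual article text.
```

A full XPath expression like the one above would need a complete XPath engine, but the principle is the same: the reader applies the path and displays only the matching subtree.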
> The main imperative for me to create the reader is the fact that the
> Google Reader's GUI is horrible IMO and I'm happy with that part of
> VizReader. That and I thought it would be an easy thing to start out
> with in PL, but there is more to a feed reader than meets the
> eye... If I had thought about making the application distributed right
> from the start I would've been even happier.

Sure, you have a different motivation and way of reading news, which
doesn't match mine. That's why I also suggested exporting a personal
feed of the collected feeds, or sending that stuff by email.

> In the beginning I also had an algorithm that compared articles for
> automatic recommendations of similar content, that worked for a short

That could be interesting, but not something crucial I would need.

Cheers,

Tomas
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
Hi Alex,

>> if I understand it well, you have all the articles locally on one
>> machine. I wonder how long a simple grep over the article blobs would
>> take? 22 seconds seems very long for any serious use. Have you
>
> Such numbers are very variable, and difficult to predict.

I'm not sure what you mean. How long does a simple grep over the
article blob files take? That should serve as a rough indicator about
worst case behaviour.

> For example, in the system mentioned in my previous mail, with
> information about millions of files distributed across several hosts,
> searching for a given combination of e.g. file name pattern and
> meta-information like access times, sizes or md5 keys might take a
> few seconds at the first access, but subsequent accesses
> (i.e. continuing the search by scrolling down the list) showed almost
> no delay at all.

Hmm, I know too little about the actual system you talk about, so it's
hard to form an educated opinion on this ;-)

Cheers,

Tomas
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
Hi Tomas,

> if I understand it well, you have all the articles locally on one
> machine. I wonder how long a simple grep over the article blobs would
> take? 22 seconds seems very long for any serious use. Have you

Such numbers are very variable, and difficult to predict.

For example, in the system mentioned in my previous mail, with
information about millions of files distributed across several hosts,
searching for a given combination of e.g. file name pattern and
meta-information like access times, sizes or md5 keys might take a
few seconds at the first access, but subsequent accesses
(i.e. continuing the search by scrolling down the list) showed almost
no delay at all.

Cheers,
- Alex
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
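The "slow first access, instant follow-up" pattern Alex describes is ordinary caching; a toy Python illustration of the same effect (nothing here is PicoLisp code, and the 0.05 s sleep merely stands in for a disk scan):

```python
import time
from functools import lru_cache

# Toy model: once a process has fetched an object, repeats are free.
@lru_cache(maxsize=None)
def fetch(key):
    time.sleep(0.05)              # stands in for scanning the db files
    return key.upper()

t0 = time.perf_counter(); fetch("md5:abc"); first = time.perf_counter() - t0
t0 = time.perf_counter(); fetch("md5:abc"); second = time.perf_counter() - t0
print(first > second)  # -> True: the first access pays, later ones don't
```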
Re: soliciting user stories of picolisp
Hi Tomas.

1.) This is what each "remote" looks like by way of E/R:

(class +WordCount +Entity)
(rel article (+Ref +Number))
(rel word (+Aux +Ref +Number) (article))
(rel count (+Number))
(rel picoStamp (+Ref +Number))

(dbs
   (4 +WordCount)
   (3 (+WordCount word article picoStamp)))

The bottleneck lies somewhere else than the actual lookup; here are some
results I just got, probably while using the application all by myself:

"picolisp"    => 1.97 s
"google"      => 7.22 s
"obama"       => 1.64 s (cached from a prior search in RAM maybe?)
"afghanistan" => 7.2 s

Note the difference between google and picolisp, even though the search
is performed in exactly the same way. The only difference is that the
system needs to do post work after the results have been fetched, and
that is more work with the google search, since it returns the maximum
50 where picolisp only returns 8. So the bottleneck is not the search
itself but rather badly optimized code that goes to work on the results
later.

> a way of extracting and specifying the interesting content from the
> harvested feeds and links their articles point to

Well, the links you should be able to see in a per feed/category link
map (I noticed it was broken; hopefully it will work from now on). As
for specifying content through an XPath, what is it that you hope to
gain by that? Give me a specific example please.

The main imperative for me to create the reader is the fact that the
Google Reader's GUI is horrible IMO, and I'm happy with that part of
VizReader. That, and I thought it would be an easy thing to start out
with in PL, but there is more to a feed reader than meets the eye... If
I had thought about making the application distributed right from the
start I would've been even happier.

In the beginning I also had an algorithm that compared articles for
automatic recommendations of similar content; that worked for a short
time. If I were to currently apply it, it would take roughly one year to
compare all articles with each other.
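A hypothetical way to confirm Henrik's diagnosis would be to time the index lookup and the post-processing of the hits separately. A Python sketch with made-up stand-ins for both phases (the real code is PicoLisp; these lambdas are placeholders):

```python
import time

# Hypothetical harness: print how long each phase of a search takes.
def timed(label, f, *args):
    t0 = time.perf_counter()
    result = f(*args)
    print("%s: %.2f s" % (label, time.perf_counter() - t0))
    return result

# Placeholder lookup returning 50 hits, placeholder post-work on them:
hits = timed("lookup", lambda w: [w] * 50, "google")
page = timed("post-work", lambda hs: [h.upper() for h in hs], hits)
```

If "post-work" dominates for the 50-hit "google" query but not for the 8-hit "picolisp" query, the per-result processing is indeed the bottleneck.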
At one point I only let it compare a random subset, but that resulted in
(predictably) random quality too :-) Also, a lot of the finesse of the
application is lost if you're not a Twitter user. The majority of the
time I spend in it is simply checking my flow from time to time, where
most of the flow consists of Twitter posts, since few "normal" feeds
have implemented the pubsub protocol yet.

Cheers,
Henrik Sarvell

On Tue, Jul 20, 2010 at 7:45 PM, Tomas Hlavaty wrote:
> Hi Henrik,
>
>> Currently vizreader.com contains roughly 350 000 articles with a full
>> word index (not partial).
>>
>> The word index is spread out on "virtual remotes", i.e. they are not
>> really on remote machines; it's more a way to split up the physical
>> database files on disk (I've written on how that is done on
>> picolisp.com). I have no way of knowing how many words are mapped to
>> their articles like this, but most of the database is occupied by these
>> indexes and it currently occupies some 30 GB all in all.
>>
>> A search for the word "Google" just took 22 seconds.
>
> if I understand it well, you have all the articles locally on one
> machine. I wonder how long a simple grep over the article blobs would
> take? 22 seconds seems very long for any serious use. Have you
> considered some state-of-the-art full text search engine, e.g. Lucene?
>
> Just curious, how did you create the word index? I implemented a simple
> search functionality and word index for LogandCMS, which you can try at
> http://demo.cms.logand.com/search.html?s=sheep, and I even keep the
> count of every word in each page for ranking purposes, but I haven't
> had a chance to run into scaling problems like that.
>
>> No other part of the application is lagging significantly except for
>> when listing new articles in my news category, due to the fact that
>> there are so many articles in that category.
>> However, the fetching
>> method is highly inefficient, as I first fetch all feeds in a category
>> and then all their articles, and then take (tail) on them to get e.g.
>> the 50 newest. Walking and then only loading the wanted articles into
>> memory would of course be the best way, and something I will look
>> into.
>>
>> Why don't you try out the application yourself now that you know how
>> big the database is and so on? If you use Google Reader you can just
>> export your subscriptions as an OPML and import it into VizReader.
>
> I tried it and it looks interesting. What feature I would actually want
> from such a system is a way of extracting and specifying the
> interesting content from the harvested feeds and the links their
> articles point to, e.g. using an XPath expression. Then, either
> publishing it as a per-user feed or sending it as email(s), so I could
> use my usual mail client to read the news.
>
> Cheers,
>
> Tomas
Re: soliciting user stories of picolisp
Hi Henrik,

> Currently vizreader.com contains roughly 350 000 articles with a full
> word index (not partial).
>
> The word index is spread out on "virtual remotes", i.e. they are not
> really on remote machines; it's more a way to split up the physical
> database files on disk (I've written on how that is done on
> picolisp.com). I have no way of knowing how many words are mapped to
> their articles like this, but most of the database is occupied by these
> indexes and it currently occupies some 30 GB all in all.
>
> A search for the word "Google" just took 22 seconds.

If I understand it well, you have all the articles locally on one
machine. I wonder how long a simple grep over the article blobs would
take? 22 seconds seems very long for any serious use. Have you
considered some state-of-the-art full text search engine, e.g. Lucene?

Just curious, how did you create the word index? I implemented a simple
search functionality and word index for LogandCMS, which you can try at
http://demo.cms.logand.com/search.html?s=sheep, and I even keep the
count of every word in each page for ranking purposes, but I haven't had
a chance to run into scaling problems like that.

> No other part of the application is lagging significantly except for
> when listing new articles in my news category, due to the fact that
> there are so many articles in that category. However, the fetching
> method is highly inefficient, as I first fetch all feeds in a category
> and then all their articles, and then take (tail) on them to get e.g.
> the 50 newest. Walking and then only loading the wanted articles into
> memory would of course be the best way, and something I will look into.
>
> Why don't you try out the application yourself now that you know how
> big the database is and so on? If you use Google Reader you can just
> export your subscriptions as an OPML and import it into VizReader.

I tried it and it looks interesting.
What feature I would actually want from such a system is a way of
extracting and specifying the interesting content from the harvested
feeds and the links their articles point to, e.g. using an XPath
expression. Then, either publishing it as a per-user feed or sending it
as email(s), so I could use my usual mail client to read the news.

Cheers,

Tomas
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
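Tomas's word index with per-page counts, as he describes it for LogandCMS, amounts to an inverted index. A minimal Python sketch of the same structure (the real index is PicoLisp; the tokenization here is an assumption):

```python
from collections import defaultdict, Counter
import re

# Sketch only: word -> [(count, page), ...], highest count first.
def build_index(pages):
    index = defaultdict(list)
    for name, text in pages.items():
        for word, n in Counter(re.findall(r"[a-z]+", text.lower())).items():
            index[word].append((n, name))
    for hits in index.values():
        hits.sort(reverse=True)        # rank pages by word count
    return index

pages = {"file1": "sheep sheep dog", "file2": "dog"}
idx = build_index(pages)
print(idx["sheep"])   # -> [(2, 'file1')]
```

This mirrors the `("picolisp" (5 . "file1") (4 . "file2") ...)` shape mentioned later in the thread: each word maps to (count . file) pairs, pre-sorted for ranking.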
Re: soliciting user stories of picolisp
On 19.07.2010 18:46, Alexander Burger wrote:
> On Mon, Jul 19, 2010 at 04:39:08PM +0200, Mateusz Jan Przybylski wrote:
> > ``So this Lisp is a newfangled language, quite like Ruby, right?''
>
> I'm deeply shocked!

I'm not surprised. In 2010, people like wrapping yet another library in
yet another framework, until "the solution(tm)" is about 47 MB (= mega
bloat) big - minimum. RAM and disk are cheap nowadays... Programmers are
admired for more LoC, not for less.

Another point may be the orientation of educational entities towards
certain "industry standards" and the vendors' "academic pricing".

Peter

P.S.: even fewer people have heard of Forth.
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
On Mon, 19 Jul 2010 18:46:55 +0200, Alexander Burger wrote:
> On Mon, Jul 19, 2010 at 04:39:08PM +0200, Mateusz Jan Przybylski
> wrote:
> > The lecturer never heard of Lisp before; after listening to my
> > explanations he wrapped it up with:
> > ``So this Lisp is a newfangled language, quite like Ruby, right?''
> > Geez...
>
> I'm deeply shocked!

Lisp never gets old
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
Hi Alex,

thank you for these. very comforting. and thank you for picolisp!

my thanks also to Mateusz and Henrik. guys, keep em coming! i am deeply
enjoying this. i hope the rest are too.

On Tue, Jul 20, 2010 at 12:46 AM, Alexander Burger wrote:
> Hi Edwin,
>
>> if anybody would be so kind to share how they have experienced running
>> picolisp in production. fine, not just stories, but also numbers. how
>
> We have been using PicoLisp in production since 1986, so I could
> perhaps tell a lot if I could remember it all. Concerning numbers, we
> have several customers running it for many years. Our oldest customer
> using the current system has had it running since January 2001 without
> interruption. The database of that customer is not very big, though
> (430 megabytes, 277723 objects).
>
>> big have your databases grown? how fast has the picolisp appserver
>
> The biggest databases were for another project, systems for indexing
> and classifying the filer systems of big customers (I should not tell
> names here). There we had distributed databases (up to 70
> interconnected databases) with nearly one billion objects. The larger
> databases within such a system were around 100-200 GB; more typical
> was around 20-80 GB.
>
>> delivered your queries? did you ever get to see the picolisp database
>
> I have never directly measured that speed; that wasn't an issue, as
> all those apps were not oriented towards especially many clients. In
> this context perhaps the results of the database contest in the German
> c't magazine (http://www.heise.de/kiosk/archiv/ct/2006/13/190) are
> relevant, where PicoLisp took second prize.
>
>> recover from unforeseen system errors like crashes from the operating
>> system and so?
>
> Fortunately, not yet. We tested such situations, however (pulling the
> plug), and normal power outages happened from time to time without any
> data loss so far.
>
>> can you please share your stories? would love to hear them.
> > I'm afraid I'm not a good story-teller, so I hope the above fragments > are useful ;-) > > Cheers, > - Alex > -- > UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe > -- UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
On Mon, Jul 19, 2010 at 10:39 PM, Mateusz Jan Przybylski wrote:
> However, a (quick'n'dirty) HTML & HTTP application in PicoLisp got me a very
> good grade for `Programming languages & paradigms' course at Uni.
>
> The lecturer never heard of Lisp before; after listening to my explanations he
> wrapped it up with:
>   ``So this Lisp is a newfangled language, quite like Ruby, right?''
> Geez...

i really hope you were kidding.

> --
> Mateusz Jan Przybylski
>
> ``One can't proceed from the informal to the formal by formal means.''

--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
Hi Edwin,

> if anybody would be so kind to share how they have experienced running
> picolisp in production. fine, not just stories, but also numbers. how

We have been using PicoLisp in production since 1986, so I could perhaps
tell a lot if I could remember it all. Concerning numbers, we have
several customers running it for many years. Our oldest customer using
the current system has had it running since January 2001 without
interruption. The database of that customer is not very big, though (430
megabytes, 277723 objects).

> big have your databases grown? how fast has the picolisp appserver

The biggest databases were for another project, systems for indexing and
classifying the filer systems of big customers (I should not tell names
here). There we had distributed databases (up to 70 interconnected
databases) with nearly one billion objects. The larger databases within
such a system were around 100-200 GB; more typical was around 20-80 GB.

> delivered your queries? did you ever get to see the picolisp database

I have never directly measured that speed; that wasn't an issue, as all
those apps were not oriented towards especially many clients. In this
context perhaps the results of the database contest in the German c't
magazine (http://www.heise.de/kiosk/archiv/ct/2006/13/190) are relevant,
where PicoLisp took second prize.

> recover from unforeseen system errors like crashes from the operating
> system and so?

Fortunately, not yet. We tested such situations, however (pulling the
plug), and normal power outages happened from time to time without any
data loss so far.

> can you please share your stories? would love to hear them.

I'm afraid I'm not a good story-teller, so I hope the above fragments
are useful ;-)

Cheers,
- Alex
--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
On Mon, Jul 19, 2010 at 04:39:08PM +0200, Mateusz Jan Przybylski wrote: > The lecturer never heard of Lisp before; after listening to my explanations > he > wrapped it up with: > ``So this Lisp is a newfangled language, quite like Ruby, right?'' > Geez... I'm deeply shocked! -- UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
Re: soliciting user stories of picolisp
Currently vizreader.com contains roughly 350 000 articles with a full
word index (not partial).

The word index is spread out on "virtual remotes", i.e. they are not
really on remote machines; it's more a way to split up the physical
database files on disk (I've written on how that is done on
picolisp.com). I have no way of knowing how many words are mapped to
their articles like this, but most of the database is occupied by these
indexes and it currently occupies some 30 GB all in all.

A search for the word "Google" just took 22 seconds.

No other part of the application is lagging significantly except for
when listing new articles in my news category, due to the fact that
there are so many articles in that category. However, the fetching
method is highly inefficient, as I first fetch all feeds in a category
and then all their articles, and then take (tail) on them to get e.g.
the 50 newest. Walking and then only loading the wanted articles into
memory would of course be the best way, and something I will look into.

Why don't you try out the application yourself now that you know how big
the database is and so on? If you use Google Reader you can just export
your subscriptions as an OPML and import it into VizReader.

Cheers,
Henrik Sarvell

On Mon, Jul 19, 2010 at 4:39 PM, Mateusz Jan Przybylski wrote:
> On Monday 19 July 2010 16:23:27 you wrote:
>> if anybody would be so kind to share how they have experienced running
>> picolisp in production.
>
> None yet, unfortunately.
>
> However, a (quick'n'dirty) HTML & HTTP application in PicoLisp got me a very
> good grade for `Programming languages & paradigms' course at Uni.
>
> The lecturer never heard of Lisp before; after listening to my explanations he
> wrapped it up with:
>   ``So this Lisp is a newfangled language, quite like Ruby, right?''
> Geez...
> --
> Mateusz Jan Przybylski
>
> ``One can't proceed from the informal to the formal by formal means.''

--
UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
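Henrik's fetch-everything-then-(tail) pattern has a standard streaming fix: keep only the k newest items while walking, instead of materializing every article first. A Python sketch (timestamped tuples stand in for articles; the real fix would use the PicoLisp database's own index walking):

```python
import heapq

def newest(articles, k=50):
    """Return the k newest articles without keeping them all in memory."""
    # heapq.nlargest streams the iterable, retains only k items at a time,
    # and returns them in descending order (newest first).
    return heapq.nlargest(k, articles)

arts = ((t, "article-%d" % t) for t in range(1000))  # oldest to newest
print(newest(arts, 3))
# -> [(999, 'article-999'), (998, 'article-998'), (997, 'article-997')]
```

Memory use is O(k) rather than proportional to the category size, which is exactly what matters for a news category with very many articles.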
Re: soliciting user stories of picolisp
On Monday 19 July 2010 16:23:27 you wrote: > if anybody would be so kind to share how they have experienced running > picolisp in production. None yet, unfortunately. However, a (quick'n'dirty) HTML & HTTP application in PicoLisp got me a very good grade for `Programming languages & paradigms' course at Uni. The lecturer never heard of Lisp before; after listening to my explanations he wrapped it up with: ``So this Lisp is a newfangled language, quite like Ruby, right?'' Geez... -- Mateusz Jan Przybylski ``One can't proceed from the informal to the formal by formal means.'' -- UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe