Re: The database as a "dictionary"?

2015-11-21 Thread Pierpaolo Bernardi
On Sat, Nov 21, 2015 at 10:37 PM, Denis Fourt wrote:
> Does B-Tree mean binary tree or balanced tree?

Nobody knows the meaning of the B.

https://en.wikipedia.org/wiki/B-tree
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


RE: The database as a "dictionary"?

2015-11-21 Thread Denis Fourt
Thanks for the information. I will look deeper into the documentation and the
code, and then try it. Does B-Tree mean binary tree or balanced tree?

Denis


> From: a...@software-lab.de
> To: picolisp@software-lab.de
> Subject: Re: The database as a "dictionary"?
> Date: Sat, 21 Nov 2015 13:04:46 +0100
>
> Hi Denis,
>
>> I have been working on some robust algorithms for text summarization and
>> matching. A very approximate (and misleading, but this is not important now)
>> description is:
>>
>> a) turn documents into lists of sentences
>> b) turn sentences into lists of words
>> c) estimate word statistics
>> d) turn sentences into lists of features, represented by numerical ids
>> e) compare sentences
>>
>> For this purpose, dictionaries whose keys are strings are used (implemented
>> with tries). I have recently begun to study the database and have been
>> wondering whether it would be better to use it for that purpose. The idea
>> would be:
>>
>> a database containing documents, sentences, words, features, ...
>>
>> And it would be possible to get the number of occurrences of one word in one
>> or all documents, or the sentences which contain a certain word or feature,
>> for example.
>>
>> Does this sound reasonable?
>
> Yes, it does. Using the database has two advantages: (1) You get
> persistence of your data, and (2) it will automatically use B-Tree
> indexes.
>
>
>> Is the database fast enough?
>
> Yes. The PicoLisp DB works in such a way that all objects once fetched
> from the DB files are cached in memory, so that further operations run
> at full speed.
>
>
>> Is it possible to automatically propagate some information within the
>> database? For example, when a word is read, its occurrence count has
>> to be incremented, but also the occurrence counts of its related features.
>
> Yes, this is what the entity/relation daemons in the database are all
> about. For example, each class of objects maintains its private count,
> and each index tree too. In addition, you can define an 'upd>' method
> for an entity class which fires when an object is modified.
>
> ♪♫ Alex

RE: questions about the gui

2015-11-21 Thread Denis Fourt
Yes, this helps; I had not understood that psh is a debugging tool. About
(app), I would still like to know whether there is a way to run more than one
application on the same computer and be sure that they will not accidentally
pick the same port. I understand that within one app there is no
problem (as long as there are not too many users), but with two or more? Or do
my network programming memories need a refresh?

Denis

> From: johtob...@gmail.com 
> To: picolisp@software-lab.de 
> Subject: Re: questions about the gui 
> Date: Sat, 21 Nov 2015 08:51:43 +0100 
> 
> 
> Hello,
>
> It is a local shell that connects to a local web server.
> It can be used for web development: you can inspect variables and so on.
> (app) opens a new process with a new port, so each process, and the
> single user linked to it, can have its own state. The user then connects
> to this new port.
> Does that help?
>
> On 21.11.2015 at 08:28, "Denis Fourt" wrote:
> Hello,
> After reading the gui documentation, I would like some help on the
> following topics, please:
> a) I am not sure I understand the purpose and uses of the psh function


Re: questions about the gui

2015-11-21 Thread Alexander Burger
Hi Denis,

in addition to what Joh-Tob said, let me correct some issues about
'app'.

> b) I understand that calling the (app) function allows multiple users to
> access an application at the same time, which makes web apps and
> collaborative software possible

What (app) really does is establish a "session".

When a client (browser) connects to the server, the server forks a child
process which sends a response to the request. This is typically a GET
request, and the child sends an HTML page to the client. At the same
time, the server parent process continues to listen for further
requests.

Now, when (app) is NOT called in the child while it generates its
response (i.e. it sends a static page), then the child process
terminates.

If, however, (app) is called, the child does not terminate. It allocates
a new port to listen for further requests from that client, allows
login, keeps the session's state, and so on.

So multi-user access to the application is also possible without (app),
but each request will be answered in a fire-and-forget style.
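
A rough sketch of that request flow, written in Python rather than PicoLisp and
assuming a Unix system with fork(). The names serve_one and keep_session are
made up for illustration; keep_session stands in for "the child called (app)".
This only mirrors the pattern described above and says nothing about how the
PicoLisp server is actually implemented:

```python
import os
import socket


def serve_one(listener, keep_session):
    """Accept one client, fork a child to answer it, return the child's pid.

    keep_session is a hypothetical flag standing in for "(app) was called".
    """
    conn, _ = listener.accept()
    pid = os.fork()
    if pid:
        # Parent: leave the connection to the child and go back to listening.
        conn.close()
        return pid

    # Child process from here on.
    listener.close()
    try:
        if not keep_session:
            # Fire-and-forget: send the page, then terminate.
            conn.sendall(b"static page\n")
        else:
            # Session: open a private listening socket on a port chosen by
            # the OS and tell the client where to send its next request.
            session = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            session.bind(("127.0.0.1", 0))
            session.listen(1)
            port = session.getsockname()[1]
            conn.sendall(b"session %d\n" % port)
            conn.close()
            # The child stays alive and answers one follow-up request itself.
            follow_up, _ = session.accept()
            follow_up.sendall(b"session page\n")
            follow_up.close()
            session.close()
    finally:
        # Never fall back into the parent's code path.
        os._exit(0)
```

In both cases the parent returns immediately and can accept the next client;
only the child's lifetime differs.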


> a process listening on a different port is created for each user, isn't it?

Right, this is what happens.


> So how do you avoid conflicts when running independent applications?

To have more than one application running on a single machine, you start
several server parent processes. Each of them will be independently
listening on its own port. We use 'httpGate' as a port proxy, so that
from the browser's view the port is always 80 (HTTP) or 443 (HTTPS), but
is relayed on the server to the right port (and thus server process).
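
The relaying idea can be sketched as a lookup from the first path component to
a backend port. This is a generic illustration in Python, not httpGate's
actual rules: the function backend_port, the apps table and the port numbers
are all hypothetical.

```python
def backend_port(path, apps, default_port):
    """Map a request path such as '/wiki/page' to (port, remaining_path).

    apps is a hypothetical table {name: port} with one entry per server
    parent process; unknown names fall through to default_port.
    """
    first, _, rest = path.lstrip("/").partition("/")
    if first in apps:
        return apps[first], "/" + rest
    return default_port, path
```

The browser only ever talks to port 80 or 443; the proxy decides which local
server process receives the request.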


> In case of single user desktop apps, not calling (app) and reserving a port
> seems sufficient. But in case of several users?

So, as you see, (app) has nothing to do with single- or multi-user.
Also, a single application will need (app) to allow sessions, and this
in turn has nothing to do with how many users access this application.

♪♫ Alex


Re: questions about the gui

2015-11-21 Thread Joh-Tob Schäg
I assume you fear that the same port gets bound twice.
That cannot happen, since Unix/Linux does not allow two processes to bind
the same port.
(app) calls the pil function (port).
The port function itself is able to find an unused port and bind it. For
more details read (doc 'port).
If you are good with C you might want to look inside net.c and understand
doport.
PicoLisp takes care of the ports for you.
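
For comparison, one common way to get an unused port is to bind to port 0 and
let the operating system choose. The Python sketch below shows that, and also
that a second bind to a port that is already in use fails. Whether net.c does
exactly this is not stated in this thread; listen_on_free_port and try_bind
are names invented for the example.

```python
import socket


def listen_on_free_port(host="127.0.0.1"):
    """Bind a listening socket to an OS-chosen port; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any unused port
    sock.listen(5)
    return sock, sock.getsockname()[1]


def try_bind(port, host="127.0.0.1"):
    """Return True if a new socket can bind the port, False if it is taken."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return True
    except OSError:
        return False
    finally:
        probe.close()
```

Two applications that each call listen_on_free_port() get different ports, and
try_bind() on a port that is still being listened on returns False.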

2015-11-21 9:36 GMT+01:00 Denis Fourt:

> Yes, this helps; I had not understood that psh is a debugging tool. About
> (app), I would still like to know whether there is a way to run more than one
> application on the same computer and be sure that they will not
> accidentally pick the same port. I understand that within one app
> there is no problem (as long as there are not too many users), but with two
> or more? Or do my network programming memories need a refresh?
>
> Denis
> 
> > From: johtob...@gmail.com
> > To: picolisp@software-lab.de
> > Subject: Re: questions about the gui
> > Date: Sat, 21 Nov 2015 08:51:43 +0100
> >
> >
> > Hello,
> >
> > It is a local shell that connects to a local web server.
> > It can be used for web development: you can inspect variables and so on.
> > (app) opens a new process with a new port, so each process, and the
> > single user linked to it, can have its own state. The user then connects
> > to this new port.
> > Does that help?
> >
> > On 21.11.2015 at 08:28, "Denis Fourt" wrote:
> > Hello,
> > After reading the gui documentation, I would like some help on the
> > following topics, please:
> > a) I am not sure I understand the purpose and uses of the psh function


Re: The database as a "dictionary"?

2015-11-21 Thread Alexander Burger
Hi Denis,

> I have been working on some robust algorithms for text summarization and
> matching. A very approximate (and misleading, but this is not important now)
> description is:
>
> a) turn documents into lists of sentences
> b) turn sentences into lists of words
> c) estimate word statistics
> d) turn sentences into lists of features, represented by numerical ids
> e) compare sentences
>
> For this purpose, dictionaries whose keys are strings are used (implemented
> with tries). I have recently begun to study the database and have been
> wondering whether it would be better to use it for that purpose. The idea
> would be:
>
> a database containing documents, sentences, words, features, ...
>
> And it would be possible to get the number of occurrences of one word in one
> or all documents, or the sentences which contain a certain word or feature,
> for example.
> 
> Does this sound reasonable?

Yes, it does. Using the database has two advantages: (1) You get
persistence of your data, and (2) it will automatically use B-Tree
indexes.
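
To make the idea concrete outside PicoLisp, here is a minimal sketch using
Python's sqlite3 module, whose indexes are also B-trees. It is a stand-in, not
the PicoLisp DB: the tables (word, occurrence) and the functions open_db,
add_sentence, count_in and sentences_with are invented for this example.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database; pass a file name to get persistence."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS word (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL   -- UNIQUE gives an index on name
        );
        CREATE TABLE IF NOT EXISTS occurrence (
            word_id  INTEGER NOT NULL REFERENCES word(id),
            doc      TEXT NOT NULL,
            sentence INTEGER NOT NULL
        );
        CREATE INDEX IF NOT EXISTS occurrence_by_word
            ON occurrence(word_id);
    """)
    return db


def add_sentence(db, doc, sentence_no, words):
    """Record every word of one sentence of one document."""
    for name in words:
        db.execute("INSERT OR IGNORE INTO word(name) VALUES (?)", (name,))
        (word_id,) = db.execute(
            "SELECT id FROM word WHERE name = ?", (name,)).fetchone()
        db.execute(
            "INSERT INTO occurrence(word_id, doc, sentence) VALUES (?, ?, ?)",
            (word_id, doc, sentence_no))
    db.commit()


def count_in(db, name, doc=None):
    """Occurrences of a word in one document, or in all if doc is None."""
    sql = ("SELECT COUNT(*) FROM occurrence "
           "JOIN word ON word.id = occurrence.word_id WHERE word.name = ?")
    args = [name]
    if doc is not None:
        sql += " AND occurrence.doc = ?"
        args.append(doc)
    return db.execute(sql, args).fetchone()[0]


def sentences_with(db, name):
    """All (doc, sentence) pairs that contain the word."""
    return db.execute(
        "SELECT DISTINCT occurrence.doc, occurrence.sentence FROM occurrence "
        "JOIN word ON word.id = occurrence.word_id WHERE word.name = ? "
        "ORDER BY occurrence.doc, occurrence.sentence", (name,)).fetchall()
```

Both queries from the question (occurrences per document or overall, and
sentences containing a word) go through the indexes rather than scanning
every row.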


> Is the database fast enough?

Yes. The PicoLisp DB works in such a way that all objects once fetched
from the DB files are cached in memory, so that further operations run
at full speed.


> Is it possible to automatically propagate some information within the
> database? For example, when a word is read, its occurrence count has
> to be incremented, but also the occurrence counts of its related features.

Yes, this is what the entity/relation daemons in the database are all
about. For example, each class of objects maintains its private count,
and each index tree too. In addition, you can define an 'upd>' method
for an entity class which fires when an object is modified.
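
The propagation pattern itself, independent of PicoLisp, looks roughly like
the following Python toy model. It is only an illustration of an update hook:
WordStats, on_word_updated and the features_of table are hypothetical names,
and nothing here is the 'upd>' API.

```python
from collections import Counter


class WordStats:
    """Toy in-memory model: counting a word also bumps its related features."""

    def __init__(self, features_of):
        # features_of maps a word to the ids of its related features
        # (assumed to be given by an earlier processing step).
        self.features_of = features_of
        self.word_count = Counter()
        self.feature_count = Counter()

    def on_word_updated(self, word):
        # Plays the role of an update hook: runs whenever a word changes.
        for feature in self.features_of.get(word, ()):
            self.feature_count[feature] += 1

    def read_word(self, word):
        self.word_count[word] += 1
        self.on_word_updated(word)
```

The caller only ever calls read_word(); the feature counts follow
automatically.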

♪♫ Alex


The database as a "dictionary"?

2015-11-21 Thread Denis Fourt
Hello,

I have been working on some robust algorithms for text summarization and
matching. A very approximate (and misleading, but this is not important now)
description is:

a) turn documents into lists of sentences
b) turn sentences into lists of words
c) estimate word statistics
d) turn sentences into lists of features, represented by numerical ids
e) compare sentences

For this purpose, dictionaries whose keys are strings are used (implemented
with tries). I have recently begun to study the database and have been
wondering whether it would be better to use it for that purpose. The idea
would be:

a database containing documents, sentences, words, features, ...

And it would be possible to get the number of occurrences of one word in one or
all documents, or the sentences which contain a certain word or feature, for
example.

Does this sound reasonable?

Is the database fast enough? The basic statistic is word occurrence, which
means that each word encountered would have to be looked up among the external
symbols; finding the value associated with a string in a trie remains fast
whatever the number of keys.

Is it possible to automatically propagate some information within the database?
For example, when a word is read, its occurrence count has to be incremented,
but also the occurrence counts of its related features.

Numerical ids have been used only for speed: testing two numbers for equality
is much faster than testing two strings. This nevertheless makes debugging
unpleasant. What about comparing two external symbols?
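
The trade-off between fast numeric ids and readable names can be softened by
keeping a reverse table, as in this small Python sketch. The Interner class is
invented for illustration and is unrelated to how PicoLisp handles symbols.

```python
class Interner:
    """Give each distinct string a small integer id, and map ids back."""

    def __init__(self):
        self._id_of = {}      # string -> id
        self._name_of = []    # id -> string

    def intern(self, name):
        ident = self._id_of.get(name)
        if ident is None:
            ident = len(self._name_of)
            self._id_of[name] = ident
            self._name_of.append(name)
        return ident

    def name(self, ident):
        return self._name_of[ident]
```

Sentences can then be compared as lists of integers, while debug output calls
name() to print the original strings.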

Thanks

Denis



