Thanks for the hint, Hernán. I will have a look at it.

I have a large "database" (at the moment ~1 GB) of notes and articles in
three languages, all in plain text, organized thematically in about 40
*.org files. Some have keywords, but mostly I search for lemmata
to gather material for new articles, reorganisation of topics, etc. In
recent years I have also added pictures belonging to the text, but
I could do that only in Scrivener.

My problem is that I did not apply keywords consistently and lost track
of where I placed text fragments on different topics. So
I am trying to reorganize this mess in an integrated environment where
I can search using my own knowledge or NLTK functions.
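
The kind of search described above can be sketched in Python roughly as
follows. This is only an illustration under stated assumptions: the
function names and the sample data are made up, Org headlines are taken
to be lines starting with one or more `*` followed by a space, and
prefix matching is a crude stand-in for real lemmatization (which NLTK
could provide).

```python
import re
from collections import defaultdict

# An Org headline: one or more stars, whitespace, then the title.
HEADLINE = re.compile(r"^(\*+)\s+(.*)$")


def split_org_sections(text):
    """Yield (headline, body) pairs from Org-mode text.

    Text before the first headline is reported under "(preamble)".
    """
    headline, body = "(preamble)", []
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match:
            yield headline, "\n".join(body)
            headline, body = match.group(2).strip(), []
        else:
            body.append(line)
    yield headline, "\n".join(body)


def find_sections(files, stem):
    """Map file name -> headlines of sections containing a word that starts with `stem`.

    `files` maps a file name to its full text. Matching is case-insensitive
    prefix matching, an assumption standing in for lemma lookup.
    """
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    hits = defaultdict(list)
    for name, text in files.items():
        for headline, body in split_org_sections(text):
            if pattern.search(headline) or pattern.search(body):
                hits[name].append(headline)
    return dict(hits)


# Hypothetical sample data standing in for the real *.org files.
sample = {
    "music.org": "* Counterpoint\nNotes on fugues and fugal writing.\n* Harmony\nChords.\n",
    "garden.org": "* Roses\nPruning in spring.\n",
}
print(find_sections(sample, "fug"))
```

Reporting hits per headline rather than per line shows where a fragment
lives in the topic structure, which is the overview that got lost.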

But that's my problem. I have a starting point now, and when I hit another
barrier I will ask a more specific question.

Cheers
Hajo

---
Cela est bien dit, mais il faut cultiver notre jardin.

http://hajos-kontrapunkte.blogspot.de/


On Thu, Mar 22, 2018 at 8:35 PM, Hernán Morales Durand
<hernan.mora...@gmail.com> wrote:
> Hello Hajo,
>
> 2018-03-22 14:54 GMT-03:00 Hajo Dezelski <dl1...@gmail.com>:
>> Hello,
>>
>> I must confess that I have not RTFM completely, so if my question has
>> already been asked or answered before, sorry.
>>
>> I used Smalltalk about 25 years ago and still have the
>> books by Goldberg and Lalonde. Since then I have watched but
>> not actively followed its development. In recent years I switched to
>> Python, also using NLTK.
>>
>> My main problem is the organisation of information in the form of
>> lots of text objects. For this I have relied heavily on Emacs with Org
>> mode and, still my favorite, Scrivener
>> (https://www.literatureandlatte.com/scrivener/overview)
>>
>> I am still looking for an integrated environment to
>> write/organize/analyse text. And I am sure that everything is in Pharo
>> and what is missing can be programmed.
>>
>
> Which kind of text analysis/organization do you want to do? NLP? FRBR?
>
> There are several options for text processing:
>
> There is NaturalSmalltalk with a stemmer, TF-IDF, supervised and
> unsupervised classifiers, k-means clustering, naive Bayes, etc.
>
> I haven't checked it, but this project
> https://github.com/mark-watson/nlp_smalltalk claims support for NER,
> POS, segmentation and summarization.
>
> There is Moose-Algos-InformationRetrieval (ex Hapax) with stemmers and
> corpus support.
>
> Maybe you can install it by evaluating:
>
> Metacello new
>         configuration: 'MooseAlgos';
>         smalltalkhubUser: 'Moose' project: 'MooseAlgos';
>         version: #development;
>         load: 'Moose-Tests-Algos-Graph'
>
>
>
>> I understand that Smalltalk is an IDE, but I haven't been pointed to
>> Pharo as a standard desktop. I found Grafoscopio, which seems to me a
>> basis for the work I do, but I still haven't found tools for standard
>> text processing, file management, dictionary lookup, etc.
>>
>> I also still haven't found working examples in the classes,
>> so that when I am unsure what a class really stands for, I could run an
>> example and start digging. For instance, so far I have not been able to
>> import my org files and see what the parser does.
>>
>> So is there documentation that explains where to find an
>> editor and markup tags, so that I can import my text base, start
>> playing with my text within Pharo, and also use it as a working
>> environment?
>>
>
> If the above doesn't fit your requirements, could you describe which
> type of text you have?
>
> Cheers,
>
> Hernán
>
>> Thanks in advance
>>
>> Hajo
>>
>> ---
>> Cela est bien dit, mais il faut cultiver notre jardin.
>>
>> http://hajos-kontrapunkte.blogspot.de/
>>
>
