Thanks for the hint, Hernán. I will have a look at it.

I have a large "database" (at the moment about 1 GB) of notes and articles in three languages, all in plain text, organized thematically in about 40 *.org files. Some of them have keywords, but mostly I search for lemmata to gather material for new articles, reorganisation of topics, etc. In recent years I have also added pictures that belong to the text, but I could do that only in Scrivener.

My problem is that I did not apply keywords consistently and have lost track of where I placed text fragments on different topics. So I am trying to reorganize this mess in an integrated environment where I can search using my own knowledge or NLTK functions. But that's my problem. I now have a starting point, and when I hit another barrier I will ask a more specific question.

Cheers,
Hajo

---
Cela est bien dit, mais il faut cultiver notre jardin.

http://hajos-kontrapunkte.blogspot.de/

On Thu, Mar 22, 2018 at 8:35 PM, Hernán Morales Durand <hernan.mora...@gmail.com> wrote:
> Hello Hajo,
>
> 2018-03-22 14:54 GMT-03:00 Hajo Dezelski <dl1...@gmail.com>:
>> Hello,
>>
>> I must confess that I have not RTFM completely, so if my question has
>> been asked or answered before, I apologize.
>>
>> I used Smalltalk about 25 years ago and still have the books by
>> Goldberg and Lalonde. Since then I have watched the development but
>> not actively followed it. In recent years I switched to Python, also
>> using NLTK.
>>
>> My main problem is the organisation of information in the form of
>> lots of text objects. For this I have relied heavily on Emacs with
>> Org mode and, still my favorite, Scrivener
>> (https://www.literatureandlatte.com/scrivener/overview)
>>
>> I am still looking for an integrated environment to
>> write/organize/analyse text. And I am sure that everything is in Pharo
>> and what is missing can be programmed.
>>
>
> What kind of text analysis/organization do you want to do? NLP? FRBR?
>
> There are several options for text processing:
>
> There is also NaturalSmalltalk with a stemmer, TF-IDF, supervised and
> unsupervised classifiers, k-means clustering, naive Bayes, etc.
>
> I haven't checked it, but this project
> https://github.com/mark-watson/nlp_smalltalk claims support for NER,
> POS, segmentation and summarization.
>
> There is Moose-Algos-InformationRetrieval (formerly Hapax) with
> stemmers and corpus support.
>
> Maybe you can install it by evaluating:
>
> Metacello new
>     configuration: 'MooseAlgos';
>     smalltalkhubUser: 'Moose' project: 'MooseAlgos';
>     version: #development;
>     load: 'Moose-Tests-Algos-Graph'
>
>
>> I understand that Smalltalk is an IDE, but I have not found pointers
>> on using Pharo as a standard desktop. I found Grafoscopio, which
>> seemed to me a basis for the work I do, but I still haven't found
>> tools for standard text processing / file management / dictionary
>> lookup etc.
>>
>> And I am still missing (or haven't found) working examples in the
>> classes, so that when I am unsure what a class really does, I could
>> run an example and start digging. For example, so far I have not been
>> able to import my org files and see what the parser does.
>>
>> So are there documents that explain where to find an editor and
>> markup tags, so that I can import my text base, start playing with my
>> text within Pharo, and also use it as a working environment?
>>
>
> If the above doesn't fit your requirements, could you describe what
> type of text you have?
>
> Cheers,
>
> Hernán
>
>> Thanks in advance
>>
>> Hajo
>>
>> ---
>> Cela est bien dit, mais il faut cultiver notre jardin.
>>
>> http://hajos-kontrapunkte.blogspot.de/
>>
>