Hi Hajo,

I have been working on a similar problem: how to organize long, complex text. After trying Jupyter, Org, Leo Editor and others, I found that the most powerful and flexible approach is to do it *inside* Pharo and to program extensions that work with the particular agile visualizations that are part of a data narrative. For that, I created Grafoscopio [1]. The User Manual [2] shows how to install and use it, compares it with other similar and/or inspiring programs, and describes the gap it is trying to fill in the ecosystem. We have a "local first" approach, so the most up-to-date information is in Spanish at [3][4] (except for the User Manual, which is almost up to date and was written in English).

[1] http://mutabit.com/grafoscopio/index.en.html
[2] http://mutabit.com/repos.fossil/grafoscopio/doc/tip/Docs/En/Books/Manual/manual.pdf
[3] http://mutabit.com/grafoscopio/
[4] http://mutabit.com/dataweek/

Let me know if Grafoscopio works for you. It is my first program and the one I used to learn Pharo, so rookie code remains in many places, but it is being actively improved and used.

Cheers,

Offray

On 22/03/18 15:43, Hajo Dezelski wrote:
> Thanks Hernán,
>
> for the hint. I will have a look at it.
>
> I have a large "database" (at the moment ~1 GB) of notes and articles in
> three languages, all in plain text, organized thematically in about 40
> *.org files. Some have keywords, but mostly I search for lemmata
> to gather material for new articles, reorganisation of topics, etc. In
> recent years I also added pictures belonging to the text, but
> could do that only in Scrivener.
>
> My problem is that I did not apply keywords stringently and lost track
> of where I placed text fragments concerning different topics. So
> I am trying to reorganize this mess in an integrated environment where
> I can search via my knowledge or NLTK functions.
>
> But that's my problem. I have a starting point, and when I hit another
> barrier I will ask more specific questions.
>
> Cheers
> Hajo
> Gruss
> Hajo
>
> ---
> Cela est bien dit, mais il faut cultiver notre jardin.
>
> http://hajos-kontrapunkte.blogspot.de/
>
>
> On Thu, Mar 22, 2018 at 8:35 PM, Hernán Morales Durand
> <[email protected]> wrote:
>> Hello Hajo,
>>
>> 2018-03-22 14:54 GMT-03:00 Hajo Dezelski <[email protected]>:
>>> Hello,
>>>
>>> I must confess that I have not RTFM completely, so if my question has
>>> been asked or answered before, sorry.
>>>
>>> I used Smalltalk about 25 years ago and still have the
>>> books by Goldberg and Lalonde. Since then I have watched but did
>>> not actively follow the development. In recent years I switched to
>>> Python, also using NLTK.
>>>
>>> My main problem is the organisation of information in the form of
>>> lots of text objects. For this I have made heavy use of Emacs and org
>>> mode, and my favorite is still Scrivener
>>> (https://www.literatureandlatte.com/scrivener/overview)
>>>
>>> I am still looking for an integrated environment to
>>> write/organize/analyse text. And I am sure that everything is in Pharo
>>> and what is missing can be programmed.
>>>
>> What kind of text analysis/organization do you want to do? NLP? FRBR?
>>
>> There are several options for text processing:
>>
>> There is also NaturalSmalltalk with a stemmer, TF-IDF, supervised and
>> unsupervised classifiers, k-means clustering, naive Bayes, etc.
>>
>> I haven't checked, but this project
>> https://github.com/mark-watson/nlp_smalltalk claims support for NER,
>> POS, segmentation and summarization.
>>
>> There is Moose-Algos-InformationRetrieval (ex Hapax) with stemmers and
>> corpus support.
>>
>> Maybe you can install it by evaluating:
>>
>> Metacello new
>>     configuration: 'MooseAlgos';
>>     smalltalkhubUser: 'Moose' project: 'MooseAlgos';
>>     version: #development;
>>     load: 'Moose-Tests-Algos-Graph'
>>
>>
>>
>>> I understand that Smalltalk is an IDE, but I haven't been pointed to
>>> Pharo as a standard desktop. I found Grafoscopio, which seemed to me a
>>> basis for the work I do, but I still haven't found tools for standard
>>> text processing / file management / dictionary lookup etc.
>>>
>>> And I am still missing/haven't found working examples in the classes,
>>> so that if you are unsure what a class really stands for, you could start
>>> an example and start digging. For example, so far I have not been able to
>>> import my org files and see what the parser does.
>>>
>>> So are there some documents that explain where to find an
>>> editor and markup tags, so that I can import my text base and start
>>> playing with my text within Pharo and also use it as a working
>>> environment?
>>>
>> If the above doesn't fit your requirements, could you say which
>> type of text you have?
>>
>> Cheers,
>>
>> Hernán
>>
>>> Thanks in advance
>>>
>>> Hajo
>>>
>>> ---
>>> Cela est bien dit, mais il faut cultiver notre jardin.
>>>
>>> http://hajos-kontrapunkte.blogspot.de/
>>>
