Hi again Hajo,

I just saw that you mentioned Grafoscopio in the first mail of your thread, and yes, you are right: we lack the tools for text management you are looking for (dictionaries and text processing). We have preliminary support for referring to external files via node links, but if you already have such a database in Org Mode files, maybe you would like to create an importer for Grafoscopio and extend it to suit your needs.

BTW, Leo Editor[1] supports importing from Org, AFAIK, and has good integration with Python, being a pure Python program, so maybe you could start there to support NLTK. That being said, the customization, live coding and visualization capabilities in the Pharo ecosystem are unbeatable when you are trying to suit your own needs.

[1] http://leoeditor.com/

Cheers,

Offray

On 22/03/18 17:42, Offray Vladimir Luna Cárdenas wrote:
> Hi Hajo,
>
> I have been working on a similar problem: how to organize long, complex
> text. I found that doing it *inside* Pharo, and programming extensions to
> work with the particular agile visualizations that are part of a data
> narrative, is the most powerful and flexible approach, after trying
> Jupyter, Org, Leo Editor and others. For that, I have created
> Grafoscopio[1]. You can see how to install and use it, and even
> comparisons with other similar and/or inspiring programs and the gap
> it's trying to fill in the ecosystem, in the User Manual [2]. We have a
> "local first" approach, so the most up-to-date information is in Spanish
> at [3][4] (except for the User Manual, which is almost up to date and was
> written in English).
>
> [1] http://mutabit.com/grafoscopio/index.en.html
> [2]
> http://mutabit.com/repos.fossil/grafoscopio/doc/tip/Docs/En/Books/Manual/manual.pdf
> [3] http://mutabit.com/grafoscopio/
> [4] http://mutabit.com/dataweek/
>
> Let me know if Grafoscopio works for you. It is my first program and the
> one I used to learn Pharo, so it has rookie code in many places and some
> remaining, but it's being improved and used actively.
>
> Cheers,
>
> Offray
>
> On 22/03/18 15:43, Hajo Dezelski wrote:
>> Thanks Hernán,
>>
>> for the hint. I will have a look at it.
>>
>> I have a large "database" (at the moment ~ 1 GB) of notes and articles in
>> three languages, all in plain text, organized thematically in about 40
>> *.org files. Some have keywords, but mostly I search for lemmata
>> to gather material for new articles, reorganisation of topics, etc. In
>> recent years I additionally added pictures belonging to the text, but
>> could do that only in Scrivener.
>>
>> My problem is that I did not apply keywords stringently and lost the
>> overview of where I placed text fragments concerning different topics.
>> So I am trying to reorganize this mess in an integrated environment where
>> I can search via my knowledge or NLTK functions.
>>
>> But that's my problem. I have a starting point, and when I hit another
>> barrier I will ask more specifically.
>>
>> Cheers
>> Hajo
>> Gruss
>> Hajo
>>
>> ---
>> Cela est bien dit, mais il faut cultiver notre jardin.
>>
>> http://hajos-kontrapunkte.blogspot.de/
>>
>>
>> On Thu, Mar 22, 2018 at 8:35 PM, Hernán Morales Durand
>> <[email protected]> wrote:
>>> Hello Hajo,
>>>
>>> 2018-03-22 14:54 GMT-03:00 Hajo Dezelski <[email protected]>:
>>>> Hello,
>>>>
>>>> I must confess that I have not RTFM totally, so if my question has
>>>> been asked or answered before, sorry.
>>>>
>>>> I used Smalltalk about 25 years ago and still have the
>>>> books from Goldberg and Lalonde. Since then I have watched but did
>>>> not actively follow its development. In recent years I switched to
>>>> Python, also using NLTK.
>>>>
>>>> My main problem is the organisation of information in the form of
>>>> lots of text objects. For this I have made heavy use of Emacs and Org
>>>> mode, and of my favorite still: Scrivener
>>>> (https://www.literatureandlatte.com/scrivener/overview)
>>>>
>>>> I am still looking for an integrated environment to
>>>> write/organize/analyse text. And I am sure that everything is in Pharo
>>>> and what is missing can be programmed.
>>>>
>>> What kind of text analysis/organization do you want to do? NLP? FRBR?
>>>
>>> There are several options for text processing:
>>>
>>> There is also NaturalSmalltalk with stemmer, TF-IDF, supervised and
>>> unsupervised classifiers, k-means clustering, naive Bayes, etc.
>>>
>>> I haven't checked, but this project
>>> https://github.com/mark-watson/nlp_smalltalk claims support for NER,
>>> POS, segmentation and summarization.
>>>
>>> There is Moose-Algos-InformationRetrieval (ex Hapax) with stemmers and
>>> corpus support.
>>>
>>> Maybe you can install it by evaluating:
>>>
>>> Metacello new
>>>     configuration: 'MooseAlgos';
>>>     smalltalkhubUser: 'Moose' project: 'MooseAlgos';
>>>     version: #development;
>>>     load: 'Moose-Tests-Algos-Graph'
>>>
>>>
>>>
>>>> I understand that Smalltalk is an IDE, but I haven't been pointed to
>>>> Pharo as a standard desktop. I found Grafoscopio, which seemed to me a
>>>> basis for the work I do, but still haven't found tools for standard
>>>> text processing / file management / dictionary lookup etc.
>>>>
>>>> And I am still missing/haven't found working examples in the classes,
>>>> so that if I am unsure what a class really stands for, I could start an
>>>> example and start digging. As an example, until now I was not able to
>>>> import my org files and see what the parser does.
>>>>
>>>> So are there some documents where it is explained where to find an
>>>> editor and markup tags, so that I can import my text base and start
>>>> playing with my text within Pharo, and also use it as a working
>>>> environment?
>>>>
>>> If the above doesn't fit your requirements, could you comment on which
>>> type of text you have?
>>>
>>> Cheers,
>>>
>>> Hernán
>>>
>>>> Thanks in advance
>>>>
>>>> Hajo
>>>>
>>>> ---
>>>> Cela est bien dit, mais il faut cultiver notre jardin.
>>>>
>>>> http://hajos-kontrapunkte.blogspot.de/
>>>>
