Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
On 25-12-2014 17:55, Constantine wrote:
On 24-12-2014 21:45, Constantine wrote:

After a lot of responses on how to do this in Writer, a short note on how to do this in Calc. ;-)

Open the text file; when the 'Text Import' wizard is shown, do the following:
1) Select character set 'Unicode (UTF-8)'
2) Separator options: 'Separated by', check 'Tab' and 'Space'; other options should not be checked.
3) At 'Text delimiter' type a space
4) Click 'OK'
5) Insert a column B, and fill it with a semicolon ';'
6) Click 'Save As', type a name, and check 'Edit filter settings'
7) The 'Export Text File' wizard should be shown.
8) Character set: 'Unicode (UTF-8)'
9) Field delimiter: space ' '
10) Text delimiter: empty ''
11) Checkboxes: only leave 'Save cell content as shown' checked.

Hi Luuk, I am afraid this doesn't work. I thought of it myself and also tried it at the beginning of my work. As I said, terms consist of 2-5 words, so when using space as separator there is no way to insert a column (especially B) for the semicolon. Besides, definitions are sometimes so long, with so many spaces, that Calc reports not being able to create enough columns for the whole content.

The fact that terms consist of 2-5 words is not in your first post. I just found it in your post from 'Wed, 24 Dec 2014 16:11:38 -0700 (MST)', where you say that this is 'not important'.

Indeed there were terms with two words or something like this: a.D. (außer Dienst) εκτός υπηρεσίας, εν αποστρατεία where (außer Dienst) belongs to a.D. Which is not bad, because (außer Dienst) in the second field is more useful to me (us). For the fewer cases with two or more German words at the beginning, well, I think we will survive that and be able to correct it manually.

The correct and professional way is what Brian suggested and I was looking for. Now I can use these expressions in the future too, because the need for their usage occurs very often in my kind of work.
The correct and professional way is certainly NOT to store these things in a DOC; a database would be far better suited. A slight variation of my approach will also work in Calc. Because you do not seem interested in this solution, I will not share it here.

-- To unsubscribe e-mail to: users+unsubscr...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
On 2014-12-25 07:17, Constantine wrote:

Brian, you are unbelievable!!! While I solved the problem with my very sloppy trick and was writing my mail in order to inform you about it, you were looking for a correct solution and writing this very long and very, very detailed answer. I am just speechless. I saved all of your instructions, not twice but three times, and also printed them out. They are priceless. These expressions are things that I need very often and very badly when working on my translations and they will make my life much easier in the future. I know that even if I had spent 30 more hours studying the manual, I would probably never have come to such clean expressions. Thank you for everything. I really hope I can do something for you in the

Hi Constantine,

I am a bit late in the thread to help you with this specific case (glad to see it is solved), but I'd like to suggest something. I think you wrote that you are working on a Linux system and are willing to learn. In that case one of the best things you can do to help yourself with similar problems in the future is to look into vim, and especially into regular expressions too. Your replacement needs are something that regular expressions would have helped you with a lot, in particular things like positive/negative lookahead/lookbehind. If vim is not your thing, then Emacs might be more to your liking. Both can be a tremendous help with any problem that has to do with text. It will take some effort to learn vim/Emacs and regular expressions, but it will be worth every second.

Grx HdV
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
On Thu, 25 Dec 2014, hdv@gmail wrote:
On 2014-12-25 07:17, Constantine wrote:

Brian, you are unbelievable!!! While I solved the problem with my very sloppy trick and was writing my mail in order to inform you about it, you were looking for a correct solution and writing this very long and very, very detailed answer. I am just speechless. I saved all of your instructions, not twice but three times, and also printed them out. They are priceless. These expressions are things that I need very often and very badly when working on my translations and they will make my life much easier in the future. I know that even if I had spent 30 more hours studying the manual, I would probably never have come to such clean expressions. Thank you for everything. I really hope I can do something for you in the

Hi Constantine,

I am a bit late in the thread to help you with this specific case (glad to see it is solved), but I'd like to suggest something. I think you wrote that you are working on a Linux system and are willing to learn. In that case one of the best things you can do to help yourself with similar problems in the future is to look into vim, and especially into regular expressions too. Your replacement needs are something that regular expressions would have helped you with a lot, in particular things like positive/negative lookahead/lookbehind. If vim is not your thing, then Emacs might be more to your liking. Both can be a tremendous help with any problem that has to do with text. It will take some effort to learn vim/Emacs and regular expressions, but it will be worth every second.

Grx HdV

did I misread the thread? I thought the solutions Brian produced _did_ use regular expressions, only within the context of LO. not a big reg exp man myself, but I probably would have attempted it with 'sed', which is also an 'editor' I guess.

subsequently Constantine points out he was exhausted, and I very much know what it means to be stuck barking up the wrong tree! after a bit they all look the same.

F.

-- Felmon Davis
The best way to make a fire with two sticks is to make sure one of them is a match. -- Will Rogers
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
At 16:37 24/12/2014 -0700, Constantine Marberg wrote:

My friend wants a very simple standalone form for his desktop, which uses this newly created text file or a Calc file as dbase, to search for a word and get all the definitions where this word occurs. So, it should be a small form with 2 fields, one small entry field for the search and a much larger field (window) where the answer appears.

As you suggest, it is easy to get your text into a two-column spreadsheet array. A simple way forward, if the list is not too long, is just to use the Find & Replace facility again. If you search for the relevant word using Find All, all cells containing the text will be highlighted. You can scroll down to see the highlighted material.

Otherwise, you probably do need a proper database.
o Start a new database (Base) document.
o Select Tables in the left Database column.
o In your spreadsheet, select the array of material.
o Drag the array into the lower Tables panel of the database window.
o Follow the instructions to create a table from the imported values.
o Spend part of the holiday season reading the Base documentation and learning enough to be able to create the required form!

I trust this helps.

Brian Barker
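For readers who want to see the lookup itself spelled out, here is a minimal Python sketch of the "search for a word and get all the definitions where this word occurs" behaviour described above. It is not part of the thread and not how Calc or Base implement it; the function name `find_entries` and the three sample entries are hypothetical, and the glossary is assumed to be a list of (term, definition) pairs.

```python
def find_entries(glossary, word):
    """Return every (term, definition) pair in which `word` occurs.

    The comparison uses casefold(), so "dienst" also finds "Dienst".
    """
    needle = word.casefold()
    return [
        (term, definition)
        for term, definition in glossary
        if needle in term.casefold() or needle in definition.casefold()
    ]


# Hypothetical sample data, only for illustration.
glossary = [
    ("a.D.", "(außer Dienst) εκτός υπηρεσίας, εν αποστρατεία"),
    ("Dienst", "υπηρεσία"),
    ("Haus", "σπίτι"),
]

matches = find_entries(glossary, "dienst")
for term, definition in matches:
    print(term, "->", definition)
```

A form with one entry field and one large result field would only need to call such a function and display the returned pairs.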
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
Just a thought from what I remember of the previous posts, but will Tom's idea of searching for the left parenthesis instead of the first space not work?

Paul

On Wed, 24 Dec 2014 19:06:31 -0700 (MST) Constantine marber...@gmail.com wrote:

Hi Brian, as you say, I will need to use Base and I already started reading the docs and experimenting with the form creation. But I would also like to report on my progress. I took all the files containing German-Greek terms and pasted them in a single text file; then, using the Linux editor pluma for various corrections (I am more comfortable there), I prepared it for the final phase, which of course was applying your instructions. Finally I opened the file in Calc and manually corrected the entries where the German term had 2 or more words. Fortunately they weren't too many. Then I filtered all the duplicates. The result is a perfect glossary for OmegaT with, believe it or not, 31.400 unique entries.

Now I started the same procedure for the Greek-German files but... These files contain too many Greek terms consisting of 2, 3, 4 and even 5 words. Too many to deal with manually. What would you say? Is there any possible way to do the job with an expression like the one you gave me? Can you think of anything? Does it not help that they are Greek characters at the beginning of the line? As far as I know, in Writer one can search for language, then perhaps also for characters of a certain non-Latin language. Combining this with an expression like the one before, it would probably work.

I am not asking you to do the work for me, but I sincerely tried everything I could and read as much as possible and still couldn't come up with anything. I will not give up trying and reading, but since you obviously have much more knowledge of the matter as well as experience, you could save me a lot of time and also save me from possible errors in the resulting file.
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
At 19:06 24/12/2014 -0700, Constantine Marberg wrote:

Now I started the same procedure for the Greek-German files but... These files contain too many Greek terms consisting of 2, 3, 4 and even 5 words. Too many to deal with manually. What would you say? Is there any possible way to do the job with an expression like the one you gave me? Can you think of anything? Does it not help that they are Greek characters at the beginning of the line?

Yup. Try searching for
([^a-z]*) (.*)
and replacing with
$1;$2
or
$1\t$2
as before.

I am not asking you to do the work for me, ...

Oh, I think you did! But no matter. ;^)

I trust this helps.

Brian Barker
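To make the effect of that search-and-replace concrete, here is a small Python sketch of the same idea outside LibreOffice. It is an illustration under assumptions, not the thread's method: the function name `split_entry` and the sample lines are hypothetical, the pattern is anchored to the whole line, and `[^A-Za-z]` is written out explicitly because Python's `re` is case-sensitive by default whereas the Find & Replace dialog normally is not.

```python
import re

# Group 1: the longest run of characters that are not Latin letters
# (so Greek words and the spaces between them), followed by one space.
# Group 2: the rest of the line, starting at the first Latin-script word.
SPLIT = re.compile(r"^([^A-Za-z]*) (.*)$")


def split_entry(line, sep=";"):
    """Insert `sep` between the Greek term and the German definition."""
    # A function is used as the replacement so that a separator such as
    # "\t" is inserted literally and never parsed as a template escape.
    return SPLIT.sub(lambda m: m.group(1) + sep + m.group(2), line)


print(split_entry("εκτός υπηρεσίας außer Dienst"))
print(split_entry("σπίτι Haus", sep="\t"))
```

Because the first group is greedy, the separator lands at the last space before the first Latin letter, which is why multi-word Greek terms stay together.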
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
On 25/12/14 04:55, Constantine wrote:

How can I avoid that? The semicolon or tab should be before the number and the parenthesis.

It looks like the match is occurring on glyphs that utilize the Latin writing system, before you get to German text.

Please tell me this last thing, I really don't know how to.

I'm assuming this is still a text file. Segregate the material into two files. grep is probably the easiest tool to use. File one is lines that contain the left-hand parenthesis. File two is lines that do not contain the left-hand parenthesis. For the second file, use the regex that you were using. For the first file, do the search on the left-hand parenthesis.

jonathon
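The two-file approach can be sketched in a few lines of Python instead of grep. This is only an illustration: the function names `partition_by_parenthesis` and `split_before_parenthesis` and the sample lines are hypothetical, and lists stand in for the two files.

```python
import re


def partition_by_parenthesis(lines):
    """Separate lines that contain "(" from lines that do not."""
    with_paren = [line for line in lines if "(" in line]
    without_paren = [line for line in lines if "(" not in line]
    return with_paren, without_paren


# Lazy first group: stop at the first space that is directly followed by "(".
BEFORE_PAREN = re.compile(r"^(.*?) (\(.*)$")


def split_before_parenthesis(line, sep=";"):
    """Replace the space in front of the first "(" with `sep`."""
    return BEFORE_PAREN.sub(lambda m: m.group(1) + sep + m.group(2), line)


lines = ["εκτός υπηρεσίας (a.D.) außer Dienst", "σπίτι Haus"]
with_paren, without_paren = partition_by_parenthesis(lines)
print([split_before_parenthesis(line) for line in with_paren])
print(without_paren)  # these would go through the earlier expression
```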
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
At 21:55 24/12/2014 -0700, Constantine Marberg wrote:

Dear Brian, you are the greatest.

Er, not quite yet, it appears!

I still have a small problem. What I mean, you can see at the following example: [...]

I cannot quote your example, as my under-performing mail client won't do Greek characters: sorry. But your examples include non-alphabetic characters after the Greek and before the German.

The semicolon or tab should be before the number and the parenthesis. Please tell me this last thing, I really don't know how to.

Two solutions: take your pick.

1. The previous expression included [^a-z] - which matches any character other than the Latin letters a to z, and so accepts the Greek. You need to extend this list to exclude anything else that might occur at the start of the second part. Taking just your examples, this might include the digits 0 to 9, a dot, the section sign (§), and parentheses. Note that some of these have meanings in regular expressions and you need to escape them in such an expression if you want them to be interpreted literally. You do this by preceding them with a backslash. So you could search for:
([^a-z0-9\.§\(\)]*) (.*)
- but note that you may have to add other rogue characters that might occur in your text.

2. Better might be to approach the problem directly: searching for the Greek instead of what is not Greek (as you indeed originally suggested). I avoided this earlier because (1) I wasn't sure it would work (it does!) and (2) I knew I couldn't actually write out the required expression including Greek characters in a message. But you should be able to search for:
([a-o ]*) ([^a-o].*)
- but with both those a's and o's actually meaning lower-case Greek alphas and omegas (so replaced with those characters), and replace as before.

... it is very important to me, for you to know, that I am not one of these lazy guys who let others do the work for them without googling or reading the documentation themselves first. I really try very hard myself first and then ask for help.

Oh, I was just joshing! I didn't think otherwise. (Or I might not have cared to investigate and reply.)

I hope I can do something in return for you to show you my appreciation.

No need!

I trust this helps.

Brian Barker
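Both solutions can be tried side by side in Python. The sketch below is an illustration under assumptions rather than the exact expressions from the message: the names `split_excluding` and `split_greek_first` and the sample lines are hypothetical, the patterns are anchored to the whole line, and the Greek class uses the Unicode blocks Greek and Coptic (U+0370 to U+03FF) and Greek Extended (U+1F00 to U+1FFF) instead of a plain alpha-to-omega range, because accented letters such as ό lie outside that range.

```python
import re

# Solution 1: exclude Latin letters plus digits, dot, section sign and
# parentheses from the first part.
EXCLUDE = re.compile(r"^([^A-Za-z0-9.§()]*) (.*)$")

# Solution 2: match the Greek directly (Greek letters and spaces), then
# require the second part to start with a non-Greek character.
GREEK = "\u0370-\u03FF\u1F00-\u1FFF"
GREEK_FIRST = re.compile(rf"^([{GREEK} ]*) ([^{GREEK}].*)$")


def split_excluding(line, sep=";"):
    return EXCLUDE.sub(lambda m: m.group(1) + sep + m.group(2), line)


def split_greek_first(line, sep=";"):
    return GREEK_FIRST.sub(lambda m: m.group(1) + sep + m.group(2), line)


sample = "εκτός υπηρεσίας 2. (außer Dienst)"
print(split_excluding(sample))
print(split_greek_first(sample))
```

With either function the separator now lands before the number and the parenthesis rather than after them. The second form needs no list of "rogue" characters, which is its main attraction.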
Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file
At 23:17 24/12/2014 -0700, Constantine Marberg wrote:

you are unbelievable!!!

I hope not!

While I solved the problem with my very sloppy trick ...

Oh, what you did - replacing a text item temporarily with a placeholder that won't occur naturally in the text in order to simplify a search - is a useful technique and not at all sloppy!

I am just speechless.

Don't be that: I'm guessing you may soon need to say Thank you to Santa.

I know, that even if I had spent 30 more hours studying the manual, I would probably never have come to such clean expressions.

I imagine you'll now be able to do such things for yourself: that's the idea, of course.

Thank you for everything.

No probs!

Brian Barker
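The placeholder technique mentioned above can be sketched as follows. The thread does not say which text item Constantine actually replaced, so everything specific here is an assumption made for illustration: the function name `split_protecting_parentheses`, the choice of a private-use character as placeholder, and the decision to protect the spaces in and before a parenthesised group so that a simple "split at the first space" keeps the group with the term.

```python
import re

# A private-use code point that should not occur naturally in the text.
PLACEHOLDER = "\uE000"

# A parenthesised group together with the space in front of it, if any.
PAREN_GROUP = re.compile(r" ?\([^)]*\)")


def split_protecting_parentheses(line, sep=";"):
    """Split at the first space that is not inside or before "(...)"."""
    # Step 1: hide the protected spaces behind the placeholder.
    protected = PAREN_GROUP.sub(
        lambda m: m.group(0).replace(" ", PLACEHOLDER), line
    )
    # Step 2: the simple search now works: split at the first space left.
    term, found, definition = protected.partition(" ")
    if not found:
        return line
    # Step 3: put the original spaces back.
    return (term + sep + definition).replace(PLACEHOLDER, " ")


print(split_protecting_parentheses(
    "a.D. (außer Dienst) εκτός υπηρεσίας, εν αποστρατεία"
))
```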