Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-26 Thread Luuk

On 25-12-2014 17:55, Constantine wrote:


On 24-12-2014 21:45, Constantine wrote:

After a lot of responses on how to do this in Writer,
a short note on how to do this in Calc. ;-)

Open the text file; when the 'Text Import' wizard is shown, do:
1) Select character set 'Unicode (UTF-8)'
2) Separator options: 'Separated by', check 'Tab' and 'Space'; other
options should not be checked.
3) At 'Text delimiter' type a space
4) Click 'OK'

5) Insert a column B, and fill it with a semi-colon ';'

6) Click 'Save As', type a name, and check 'Edit filter settings'
7) The 'Export Text File' wizard should be shown.
8) Character set: 'Unicode (UTF-8)'
9) Field delimiter: space ' '
10) Text delimiter: empty ''
11) checkboxes: only leave 'Save cell content as shown' checked.
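
For readers who prefer a script to the import/export wizards, the effect of steps 1-11 can be approximated in Python. This is only a sketch: the function name insert_semicolon is invented for illustration, and it assumes the term is the first whitespace-delimited word on each line, which is the same limitation the Calc approach has.

```python
def insert_semicolon(line: str) -> str:
    """Put ' ; ' after the first whitespace-delimited word of a line."""
    text = line.rstrip("\n")
    parts = text.split(None, 1)  # split once, on the first run of spaces/tabs
    if len(parts) < 2:
        return text  # a single word: nothing to separate
    first, rest = parts
    return f"{first} ; {rest}"


if __name__ == "__main__":
    sample = "a.D. (außer Dienst) εκτός υπηρεσίας, εν αποστρατεία"
    print(insert_semicolon(sample))
```

Lines whose term has more than one word would still need manual correction afterwards.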


Hi Luuk,

I am afraid this doesn't work. I thought of it myself and also tried it at
the beginning of my work.
As I said, terms consist of 2-5 words, so when using space as the separator
there is no way to insert a column (especially B) for the semicolon. Besides,
definitions are sometimes so long, with so many spaces, that Calc reports not
being able to create enough columns for the whole content.



The fact that terms consist of 2-5 words is not in your first post.

I just found it in your post from 'Wed, 24 Dec 2014 16:11:38 -0700
(MST)', where you say that this is 'not important'.

Indeed there were terms with two words or something like this:

a.D. (außer Dienst) εκτός υπηρεσίας, εν αποστρατεία

where (außer Dienst) belongs to a.D.
Which is not bad because (außer Dienst) in the second field is more
useful to me (us).

For the few cases with two or more German words at the beginning, well, I
think we will survive that and be able to correct it manually.





The correct and professional way is what Brian suggested and I was looking
for.
Now I can use these expressions in the future too because the need for their
usage occurs very often in my kind of work.




The correct and professional way is certainly NOT to store these things
in a DOC; a database would be far better suited.


A slight variation of my approach will also work in Calc. Because you
seem not interested in this solution, I will not share it here.




--
To unsubscribe e-mail to: users+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-25 Thread hdv@gmail
On 2014-12-25 07:17, Constantine wrote:
 Brian,
 
 you are unbelievable!!!
 
 While I solved the problem with my very sloppy trick and was writing 
 my mail in order to inform you about it, you were looking for a 
 correct solution and writing this very long and very very detailed 
 answer.
 
 I am just speechless.
 
 I saved all of your instructions, not twice but three times and also 
 printed them out. They are priceless.
 
 These expressions are things that I need very often and very badly 
 when working on my translations and they will make my life much 
 easier in the future. I know, that even if I had spent 30 more hours 
 studying the manual, I would probably never have come to such clean 
 expressions.
 
 Thank you for everything. I really hope I can do something for you
 in the

Hi Constantine,

I am a bit late in the thread to help you with this specific case (glad
to see it is solved), but I'd like to suggest something. I think you
wrote that you are working on a Linux system and are willing to learn.
In that case, one of the best things you can do to help yourself with
similar problems in the future is to look into vim and especially into
regular expressions. Regular expressions would have helped you a lot
with your replacement needs, in particular features like
positive/negative lookahead/lookbehind. If vim is
not your thing, then Emacs might be more to your liking. Both can be a
tremendous help with any problem that has to do with text. It will take
some effort to learn vim/Emacs and regular expressions, but it will be
worth every second.
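
To give an idea of what lookahead and lookbehind can do for this glossary, here is a small sketch. It is written in Python only for illustration (vim and Emacs use their own syntax for the same idea), the names BOUNDARY and mark_boundary are made up, and the sample line is invented. The pattern matches a single space that has a Greek letter before it and a non-Greek character after it, without consuming either neighbour.

```python
import re

# One space with a Greek letter before it and a non-Greek, non-space
# character after it. Only the space itself is matched.
# \u0370-\u03FF is the Unicode "Greek and Coptic" block.
BOUNDARY = re.compile(r"(?<=[\u0370-\u03FF]) (?=[^\u0370-\u03FF\s])")


def mark_boundary(line: str, sep: str = ";") -> str:
    """Replace the first Greek-to-non-Greek space with sep."""
    return BOUNDARY.sub(lambda m: sep, line, count=1)


if __name__ == "__main__":
    print(mark_boundary("καλά έργα gute Werke"))
```

Because the neighbours are only inspected, not captured, the replacement is just the separator itself.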

Grx HdV




Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-25 Thread Felmon Davis

On Thu, 25 Dec 2014, hdv@gmail wrote:


On 2014-12-25 07:17, Constantine wrote:

Brian,

you are unbelievable!!!

While I solved the problem with my very sloppy trick and was writing
my mail in order to inform you about it, you were looking for a
correct solution and writing this very long and very very detailed
answer.

I am just speechless.

I saved all of your instructions, not twice but three times and also
printed them out. They are priceless.

These expressions are things that I need very often and very badly
when working on my translations and they will make my life much
easier in the future. I know, that even if I had spent 30 more hours
studying the manual, I would probably never have come to such clean
expressions.

Thank you for everything. I really hope I can do something for you
in the


Hi Constantine,

I am a bit late in the thread to help you with this specific case (glad
to see it is solved), but I'd like to suggest something. I think you
wrote that you are working on a Linux system and are willing to learn.
In that case, one of the best things you can do to help yourself with
similar problems in the future is to look into vim and especially into
regular expressions. Regular expressions would have helped you a lot
with your replacement needs, in particular features like
positive/negative lookahead/lookbehind. If vim is
not your thing, then Emacs might be more to your liking. Both can be a
tremendous help with any problem that has to do with text. It will take
some effort to learn vim/Emacs and regular expressions, but it will be
worth every second.

Grx HdV




did I misread the thread? I thought the solutions Brian produced _did_
use regular expressions, only within the context of LO.


not a big reg exp man myself but I probably would have attempted it 
with 'sed' which is also an 'editor' I guess.


subsequently Constantine points out he was exhausted and I very much 
know what it means to be stuck barking up the wrong tree! after a bit 
they all look the same.


F.

--
Felmon Davis

The best way to make a fire with two sticks is to make sure one of 
them is a match.

-- Will Rogers





Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-24 Thread Brian Barker

At 16:37 24/12/2014 -0700, Constantine Marberg wrote:
My friend wants a very simple standalone form for his desktop, which 
uses this newly created text-file or a calc -file as dbase, to 
search for a word and get all the definitions where this word 
occurs. So, it should be a small form with 2 fields, one small entry 
field for the search and a much larger field (window) where the answer appears.


As you suggest, it is easy to get your text into a two-column 
spreadsheet array. A simple way forward, if the list is not too long, 
is just to use the Find & Replace facility again. If you search for
the relevant word using Find All, all cells containing the text will 
be highlighted. You can scroll down to see the highlighted material.


Otherwise, you probably do need a proper database.
o Start a new database (Base) document.
o Select Tables in the left Database column.
o In your spreadsheet, select the array of material.
o Drag the array into the lower Tables panel of the database window.
o Follow the instructions to create a table from the imported values.
o Spend part of the holiday season reading the Base documentation and 
learning enough to be able to create the required form!


I trust this helps.

Brian Barker





Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-24 Thread Paul
Just a thought from what I remember of the previous posts, but will
Tom's idea of searching for the left parenthesis instead of the first
space not work?


Paul



On Wed, 24 Dec 2014 19:06:31 -0700 (MST)
Constantine marber...@gmail.com wrote:

 Hi Brian,
 
 as you say, I will need to use base and I already started reading the
 docs and experimenting with the form creation.
 
 But I would also like to report on my progress.
 I took all the files containing German-Greek terms and pasted them in
 a single text-file, then using the linux editor pluma for various
 corrections (I am more comfortable there) I prepared it for the final
 phase, which of course was applying your instructions.
 Finally I opened the file in calc and manually corrected the entries
 where the German term had 2 or more words. Fortunately they weren't
 too many. Then I filtered all the duplicates.
 The result is a perfect glossary for OmegaT with, believe it or not, 31,400
 unique entries.
 
 Now I started the same procedure for the Greek-German files but...
 These files contain too many greek terms consisting of 2, 3, 4 and
 even 5 words. Too many to deal with manually.
 
 What would you say? Is there any possible way to do the job with an
 expression like the one you gave me?
 Can you think of anything? Does it not help that they are greek
 characters at the beginning of the line?
 As far as I know in writer one can search for language, then perhaps
 also for characters of a certain non latin language.
 Combining this with an expression like the one before, it would
 probably work.
 
 I am not asking you to do the work for me, but I sincerely tried
 everything I could and read as much as possible and still couldn't come
 up with anything. I will not give up trying and reading, but since
 you obviously have much more knowledge of the matter as well as
 experience, you could save me a lot of time but also from possible
 errors in the resulting file.
 
 
 
 
 --
 View this message in context:
 http://nabble.documentfoundation.org/Creating-a-dictionary-with-libreoffice-from-a-simple-TXT-file-tp4133988p4134001.html
 Sent from the Users mailing list archive at Nabble.com.
 





Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-24 Thread Brian Barker

At 19:06 24/12/2014 -0700, Constantine Marberg wrote:
Now I started the same procedure for the Greek-German files but... 
These files contain too many Greek terms consisting of 2, 3, 4 and 
even 5 words. Too many to deal with manually. What would you say? Is 
there any possible way to do the job with an expression like the one 
you gave me? Can you think of anything? Does it not help that they 
are Greek characters at the beginning of the line?


Yup. Try searching for
([^a-z]*) (.*)
and replacing with
$1;$2 or $1\t$2
as before.
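
The same search and replace can be tried outside LibreOffice. The following Python sketch is an adaptation, not the exact LibreOffice expression: the character class also lists A-Z, on the assumption that the search is run with "Match case" switched off, and the name split_entry and the sample line are invented for illustration.

```python
import re

# Python adaptation of the search expression. A-Z is added to the class
# as an assumption, to mimic a case-insensitive search.
SPLIT = re.compile(r"([^a-zA-Z]*) (.*)")


def split_entry(line: str, sep: str = ";") -> str:
    """Replace the space before the first Latin-script word with sep."""
    return SPLIT.sub(lambda m: m.group(1) + sep + m.group(2), line, count=1)


if __name__ == "__main__":
    print(split_entry("εκτός υπηρεσίας außer Dienst"))
    print(split_entry("εκτός υπηρεσίας außer Dienst", sep="\t"))
```

The first group is greedy, so it takes every leading non-Latin character and then backs off to the last space before the first Latin letter.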


I am not asking you to do the work for me, ...


Oh, I think you did! But no matter. ;^)

I trust this helps.

Brian Barker





Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-24 Thread jonathon


On 25/12/14 04:55, Constantine wrote:

 How can I avoid that? The semicolon or tab should be before the number and
 the parenthesis.

It looks like the match is occurring on glyphs that utilize the Latin
writing system, before you get to German text.

 Please tell me this last thing, I really don't know how to.

I'm assuming this is still a text file.

Segregate the material into two files. GREP probably is the easiest tool
to use.
File one is words that contain the left hand parenthesis.
File two is words that do not contain the left hand parenthesis.

For the second file, use the regex that you were using.
For the first file, do the search on the left hand parenthesis.
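
If grep is not at hand, the segregation step can be sketched in Python as well. The function name partition_by_paren is hypothetical; it works on a list of lines and leaves reading and writing the two files to the caller.

```python
def partition_by_paren(lines):
    """Split lines into (with_paren, without_paren), keeping their order."""
    with_paren, without_paren = [], []
    for line in lines:
        # A line goes to the first list if it contains a left parenthesis.
        (with_paren if "(" in line else without_paren).append(line)
    return with_paren, without_paren


if __name__ == "__main__":
    sample = ["a.D. (außer Dienst) εκτός υπηρεσίας", "Dienst υπηρεσία"]
    first, second = partition_by_paren(sample)
    print(first)
    print(second)
```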

jonathon




Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-24 Thread Brian Barker

At 21:55 24/12/2014 -0700, Constantine Marberg wrote:

Dear Brian, you are the greatest.


Er, not quite yet, it appears!

I still have a small problem. You can see what
I mean in the following example:

[...]


I cannot quote your example, as my 
under-performing mail client won't do Greek 
characters: sorry. But your examples include 
non-alphabetic characters after the Greek and before the German.


The semicolon or tab should be before the number 
and the parenthesis. Please tell me this last 
thing, I really don't know how to.


Two solutions: take your pick.

1. The previous expression included [^a-z] -
which matches any character other than the Latin
letters a to z. You need to extend this list to
exclude anything else that might occur in the second
part. Taking just your examples, this might include
the digits 0 to 9, a dot, the section sign (§), and
parentheses. Note that some of these have meanings in
regular expressions and you need to escape them in such
an expression if you want them to be interpreted
literally. You do this by preceding them with a
backslash. So you could search for:

([^a-z0-9\.§\(\)]*) (.*)
- but note that you may have to add other rogue 
characters that might occur in your text.
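
As a rough check of the extended expression, here is a Python sketch. It is an adaptation under stated assumptions: A-Z is added to the class to mimic a case-insensitive search, the name split_entry_ext is made up, and the sample lines are invented because the original examples could not be quoted.

```python
import re

# Digits, dot, section sign and parentheses are excluded from the first
# group too. Inside [...] the dot and the parentheses need no backslash
# in Python. A-Z is an added assumption (case-insensitive search).
SPLIT_EXT = re.compile(r"([^a-zA-Z0-9.§()]*) (.*)")


def split_entry_ext(line: str, sep: str = ";") -> str:
    """Replace the space before the first excluded character with sep."""
    return SPLIT_EXT.sub(lambda m: m.group(1) + sep + m.group(2), line, count=1)


if __name__ == "__main__":
    print(split_entry_ext("εκτός υπηρεσίας (a.D.) außer Dienst"))
    print(split_entry_ext("πρώτος 1. erster"))
```

Any further rogue character that can start the second part has to be added to the class in the same way.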


2. Better might be to approach the problem 
directly: searching for the Greek instead of what 
is not Greek (as you indeed originally 
suggested). I avoided this earlier because (1) I 
wasn't sure it would work (it does!) and (2) I 
knew I couldn't actually write out the required 
expression including Greek characters in a 
message. But you should be able to search for:

([a-o ]*) ([^a-o].*)
- but with both those 'a's and 'o's actually
meaning lower-case Greek alphas and omegas (so
replaced with those characters), and replace as before.
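
With the Greek letters filled in, the expression reads ([α-ω ]*) ([^α-ω].*). The Python sketch below tries the same idea with one deliberate change, which is an assumption of the sketch and not part of the suggestion above: a literal α-ω range covers only U+03B1 to U+03C9 and so leaves out accented letters such as ά (U+03AC) and ό (U+03CC), so the sketch uses the Unicode Greek blocks instead. The name split_greek and the sample lines are invented.

```python
import re

# Assumption: instead of the α-ω range, use the Unicode blocks
# "Greek and Coptic" (U+0370-U+03FF) and "Greek Extended" (U+1F00-U+1FFF),
# so that accented letters are treated as Greek too.
GREEK = r"\u0370-\u03FF\u1F00-\u1FFF"
SPLIT_GREEK = re.compile(rf"([{GREEK} ]*) ([^{GREEK}].*)")


def split_greek(line: str, sep: str = ";") -> str:
    """Replace the space after the leading Greek words with sep."""
    return SPLIT_GREEK.sub(lambda m: m.group(1) + sep + m.group(2), line, count=1)


if __name__ == "__main__":
    print(split_greek("καλά έργα gute Werke"))
    print(split_greek("εκτός υπηρεσίας (a.D.) außer Dienst"))
```

Because the first group accepts only Greek letters and spaces, digits, parentheses and other non-Greek characters automatically end up in the second part.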


... it is very important to me for you to know
that I am not one of these lazy guys who let
others do the work for them without googling or
reading the documentation themselves first. I
really try very hard myself first and then ask for help.


Oh, I was just joshing! I didn't think otherwise. 
(Or I might not have cared to investigate and reply.)



I hope I can do something in return for you to show you my appreciation.


No need!

I trust this helps.

Brian Barker





Re: [libreoffice-users] Re: Creating a dictionary with libreoffice from a simple TXT-file

2014-12-24 Thread Brian Barker

At 23:17 24/12/2014 -0700, Constantine Marberg wrote:

you are unbelievable!!!


I hope not!


While I solved the problem with my very sloppy trick ...


Oh, what you did - replacing a text item temporarily with a 
placeholder that won't occur naturally in the text in order to 
simplify a search - is a useful technique and not at all sloppy!
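
For anyone reading along, the placeholder technique can be sketched in Python as follows. The thread does not show which text item was actually replaced, so this is a hypothetical variant: it hides the spaces in and just before a parenthesised group, splits at the first remaining space, and then restores the hidden spaces. The placeholder character and the name split_protecting_parens are assumptions.

```python
import re

PLACEHOLDER = "\u241f"  # assumed never to occur in the glossary text


def split_protecting_parens(line: str, sep: str = ";") -> str:
    # 1. Hide the spaces in and just before each parenthesised group.
    hidden = re.sub(
        r" ?\([^)]*\)",
        lambda m: m.group(0).replace(" ", PLACEHOLDER),
        line,
    )
    # 2. The first remaining space now separates term from definition.
    term, found, definition = hidden.partition(" ")
    joined = term + sep + definition if found else hidden
    # 3. Put the hidden spaces back.
    return joined.replace(PLACEHOLDER, " ")


if __name__ == "__main__":
    print(split_protecting_parens(
        "a.D. (außer Dienst) εκτός υπηρεσίας, εν αποστρατεία"))
```

The only requirement is that the placeholder never occurs in the real text, otherwise the final restore step would alter it.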



I am just speechless.


Don't be that: I'm guessing you may soon need to say "Thank you" to Santa.

I know, that even if I had spent 30 more hours studying the manual, 
I would probably never have come to such clean expressions.


I imagine you'll now be able to do such things for yourself: that's 
the idea, of course.



Thank you for everything.


No probs!

Brian Barker 


