Re: OT: copy a modern greek-encoded text

2009-12-29 Thread Liviu Andronic
On Mon, Dec 28, 2009 at 4:16 PM, Diego Queiroz  wrote:
> If the translation of this document is very important, I think you have to
> transliterate it yourself or use an OCR tool (of course, this tool has
> to be able to recognize Greek characters, so it won't be an easy task).
>
There are Cuneiform and OCRopus, which might be able to recognize Greek
characters.

Otherwise, try the improbable solution of PDF-to-Word conversion [1].
Liviu

[1] http://www.pdftoword.com/


Re: OT: copy a modern greek-encoded text

2009-12-28 Thread Piero

> 
> If the translation of this document is very important, I think you have to
> transliterate it yourself or use an OCR tool (of course, this tool has
> to be able to recognize Greek characters, so it won't be an easy task).

Thanks, I will try to contact the author himself and ask him for a translation 
and/or the original file.

Thank you very much!!!




Re: OT: copy a modern greek-encoded text

2009-12-28 Thread Diego Queiroz
Piero Faustini,

This problem is very difficult to deal with.

It is happening because the document is not using Unicode. It uses a
common 8-bit ("ANSI") encoding with an unusual font that has Greek glyphs in
place of the Latin ones you are used to (the font is embedded in the PDF
file). This is why you see "bad characters" when you copy and paste its
content: your editor shows the pasted codes with a font that uses a
different encoding than the one in the document.
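
To illustrate what such a font-specific encoding means, here is a minimal
Python sketch of remapping copied characters to Unicode Greek. The table
FONT_TO_UNICODE and the function remap are hypothetical: I do not know the
real mapping of the font in this PDF, so the table would have to be built by
hand by comparing the copied characters with the glyphs shown on the page.

```python
# Hypothetical mapping from the codes the PDF hands out (plain Latin letters)
# to the Greek letters the embedded font actually draws for them.
# The real table for this document is unknown and must be worked out by hand.
FONT_TO_UNICODE = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε",
    "l": "λ", "o": "ο", "p": "π", "s": "σ",
}

# str.maketrans turns the dict into a table usable by str.translate.
TABLE = str.maketrans(FONT_TO_UNICODE)


def remap(text: str) -> str:
    """Replace each known font code with its Unicode Greek letter.

    Characters that are not in the table are left unchanged.
    """
    return text.translate(TABLE)


if __name__ == "__main__":
    print(remap("logos"))
```

With a complete table, the remapped text would be ordinary Unicode that can
be pasted into a translator.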

Even if you extract the fonts and install them (I think you can do this
using FontForge http://fontforge.sourceforge.net/ ), it won't solve your
problem, because Google (and most translators/tools on the web) needs
Unicode. :-(
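
A quick way to check whether a pasted text really contains Unicode Greek
letters (and not Latin codes that merely look Greek in the embedded font) is
sketched below in Python. The function name greek_ratio is my own choice, and
the idea of using the share of Greek letters as a test is only a heuristic.

```python
import unicodedata


def greek_ratio(text: str) -> float:
    """Return the fraction of letters whose Unicode name starts with GREEK.

    Non-letters (digits, spaces, punctuation) are ignored.
    An empty or letter-free text gives 0.0.
    """
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    greek = sum(
        1 for c in letters if unicodedata.name(c, "").startswith("GREEK")
    )
    return greek / len(letters)


if __name__ == "__main__":
    # Text copied from a PDF with a custom font encoding would score near 0,
    # while real Unicode Greek text would score near 1.
    print(greek_ratio("logos"))
    print(greek_ratio("λόγος"))
```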

There is a tool on the web I often use which converts Latin text to other
scripts (including Greek): http://www.translit.ru/?direction=gr .
But the font in this document uses a very different encoding, so it did
not work either. :-(

If the translation of this document is very important, I think you have to
transliterate it yourself or use an OCR tool (of course, this tool has
to be able to recognize Greek characters, so it won't be an easy task).


Regards,
---
Diego Queiroz


On Mon, Dec 28, 2009 at 12:31 PM, Manoj Rajagopalan wrote:

>
> Maybe Adobe performs some kind of OCR when you select a piece of text from
> a PDF document and tries to copy what it can infer onto the clipboard. In
> that case it might just not be as good with Greek as it is with English.
> When I tried to select a line, the selection highlight spilled over to
> neighbouring lines, so Acrobat Reader wasn't very good at identifying line
> boundaries to start with. I am using Acrobat 9.0 on Kubuntu 8.04.
>
> Have you tried googling for phrases like "acrobat copy
> greek/international ..."
>
> cheers!
> Manoj
>
>
> On Monday 28 December 2009 08:44:20 am Piero Faustini wrote:
> > Hello,
> >
> > I know this is a bit off-topic (well, COMPLETELY off-topic), but I am
> > asking on this list because I know there are some people here with
> > knowledge of international language encodings who could at least
> > redirect me to some other community for this kind of problem.
> >
> > I need to copy some text (OK, I confess I would like to use Google
> > Translate to find out what it is about!) from a PDF document at this
> > location: http://www.ionio.gr/~GreekMus/articles/samaras.pdf
> >
> > Anyway, I tried to copy the text into some editor/word processor but
> > always failed and got only bad characters, keeping only some words
> > written in the Latin alphabet.
> >
> > Maybe a Greek user can copy and then re-paste the text in a suitable
> > format.
> >
> > Any other ideas/help?
> >
> > Thanks,
> > Piero
>
>


Re: OT: copy a modern greek-encoded text

2009-12-28 Thread Manoj Rajagopalan

Maybe Adobe performs some kind of OCR when you select a piece of text from a
PDF document and tries to copy what it can infer onto the clipboard. In that
case it might just not be as good with Greek as it is with English. When I
tried to select a line, the selection highlight spilled over to
neighbouring lines, so Acrobat Reader wasn't very good at identifying line
boundaries to start with. I am using Acrobat 9.0 on Kubuntu 8.04.

Have you tried googling for phrases like "acrobat copy 
greek/international ..."

cheers!
Manoj


On Monday 28 December 2009 08:44:20 am Piero Faustini wrote:
> Hello,
>
> I know this is a bit off-topic (well, COMPLETELY off-topic), but I am
> asking on this list because I know there are some people here with
> knowledge of international language encodings who could at least
> redirect me to some other community for this kind of problem.
>
> I need to copy some text (OK, I confess I would like to use Google
> Translate to find out what it is about!) from a PDF document at this
> location: http://www.ionio.gr/~GreekMus/articles/samaras.pdf
>
> Anyway, I tried to copy the text into some editor/word processor but
> always failed and got only bad characters, keeping only some words
> written in the Latin alphabet.
>
> Maybe a Greek user can copy and then re-paste the text in a suitable
> format.
>
> Any other ideas/help?
>
> Thanks,
> Piero



OT: copy a modern greek-encoded text

2009-12-28 Thread Piero Faustini

Hello,

I know this is a bit off-topic (well, COMPLETELY off-topic), but I am asking
on this list because I know there are some people here with knowledge of
international language encodings who could at least redirect me to some
other community for this kind of problem.

I need to copy some text (OK, I confess I would like to use Google Translate
to find out what it is about!) from a PDF document at this location:
http://www.ionio.gr/~GreekMus/articles/samaras.pdf

Anyway, I tried to copy the text into some editor/word processor but always
failed and got only bad characters, keeping only some words written in the
Latin alphabet.

Maybe a Greek user can copy and then re-paste the text in a suitable format.

Any other ideas/help?

Thanks,
Piero