Re: OT: copy a modern greek-encoded text

2009-12-29 Thread Liviu Andronic
On Mon, Dec 28, 2009 at 4:16 PM, Diego Queiroz queiroz.di...@gmail.com wrote:
> If the translation of this document is very important, I think you have to
> transliterate it by yourself or using an OCR tool (of course, this tool has
> to be able to distinguish Greek characters, so it won't be an easy task).

Cuneiform and OCRopus are two OCR tools that might be able to recognize
Greek characters.

Otherwise, try the unlikely solution of PDF-to-Word conversion [1].
Liviu

[1] http://www.pdftoword.com/


OT: copy a modern greek-encoded text

2009-12-28 Thread Piero Faustini

Hello,

I know this is a bit off-topic (well, COMPLETELY off-topic), but I am asking
on this mailing list because I know there are people here with knowledge of
international language encodings who could at least redirect me to another
community for this kind of problem.

I need to copy some text (OK, I confess I would like to use Google Translate
to find out what it is about!) from a PDF document at this location:
http://www.ionio.gr/~GreekMus/articles/samaras.pdf

Anyway, I tried to copy the text into several editors and word processors but
always failed: I got only bad characters, and only a few words written in the
Latin alphabet survived.

Maybe a Greek user can copy the text and paste it again in a suitable format.
Any other ideas/help?

Thanks,
Piero

Re: OT: copy a modern greek-encoded text

2009-12-28 Thread Manoj Rajagopalan

Maybe Adobe performs some kind of OCR when you select a piece of text from a
PDF document and tries to copy what it can infer onto the clipboard. In that
case it might just not be as good with Greek as it is with English. When I
tried to select a line, the selection highlight spilled over to
neighbouring lines, so Acrobat Reader wasn't very good at identifying line
boundaries to start with. I am using Acrobat 9.0 on Kubuntu 8.04.

Have you tried googling for phrases like "acrobat copy
greek/international ..."?

cheers!
Manoj


On Monday 28 December 2009 08:44:20 am Piero Faustini wrote:
> Hello,
>
> I know this is a bit off-topic (well, COMPLETELY off-topic), but I am asking
> on this mailing list because I know there are people here with knowledge of
> international language encodings who could at least redirect me to another
> community for this kind of problem.
>
> I need to copy some text (OK, I confess I would like to use Google Translate
> to find out what it is about!) from a PDF document at this location:
> http://www.ionio.gr/~GreekMus/articles/samaras.pdf
>
> Anyway, I tried to copy the text into several editors and word processors but
> always failed: I got only bad characters, and only a few words written in the
> Latin alphabet survived.
>
> Maybe a Greek user can copy the text and paste it again in a suitable format.
> Any other ideas/help?
>
> Thanks,
> Piero



Re: OT: copy a modern greek-encoded text

2009-12-28 Thread Diego Queiroz
Piero Faustini,

This problem is very difficult to deal with.

It happens because the document does not use Unicode. It uses an ordinary
8-bit (ANSI) encoding together with an unusual font that has Greek glyphs in
place of the Latin ones you are used to (the font is embedded in the PDF
file). This is why you see "bad characters" when you copy and paste its
content: the pasted data are character codes that only make sense with that
font, and your editor displays them with a font that uses a different encoding.
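
To make the idea concrete, here is a minimal Python sketch of what a manual fix would look like: each character that comes out of the PDF is looked up in a table and replaced by the Unicode Greek letter the embedded font draws for it. The table below is purely hypothetical (it follows the common "a draws alpha, b draws beta" convention); I have not worked out the real mapping for the font in this document, and the names FONT_TO_GREEK and remap_to_unicode are only illustrative.

```python
# Hypothetical mapping from the characters you get when pasting
# to the Greek letters the embedded font actually draws.
# The real table depends on the font inside the PDF and must be
# built by comparing the pasted text with what is shown on screen.
FONT_TO_GREEK = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε",
    "l": "λ", "m": "μ", "o": "ο", "p": "π", "s": "σ", "t": "τ",
}


def remap_to_unicode(pasted: str, table: dict[str, str] = FONT_TO_GREEK) -> str:
    """Replace every character found in the table with its Unicode
    Greek letter; characters not in the table pass through unchanged."""
    return pasted.translate(str.maketrans(table))


if __name__ == "__main__":
    print(remap_to_unicode("logos"))
```

Once the text is real Unicode Greek, it can be pasted into a translator. Characters that are missing from the table stay as they are, which makes the gaps easy to spot.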

Even if you extract the fonts and install them (I think you can do this
using FontForge http://fontforge.sourceforge.net/ ), it won't solve your
problem, because Google (and most translators/tools on the web) need
Unicode. :-(

There is a tool on the web I often use that converts Latin text to other
alphabets (including Greek): http://www.translit.ru/?direction=gr .
But the font in this document uses a very different encoding, so it did
not work either. :-(

If the translation of this document is very important, I think you have to
transliterate it by yourself or using an OCR tool (of course, this tool has
to be able to distinguish Greek characters, so it won't be an easy task).
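
If you do go the manual route, a small helper can at least tell you which characters still need an entry in your conversion table. This is only a sketch under my own assumptions: the function name unmapped_characters is made up, and the table is whatever mapping you build by hand for this particular font.

```python
from collections import Counter


def unmapped_characters(pasted: str, table: dict[str, str]) -> list[tuple[str, int]]:
    """List the non-space characters of the pasted text that have no
    entry in the table yet, most frequent first, with their counts."""
    counts = Counter(
        ch for ch in pasted if not ch.isspace() and ch not in table
    )
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical partial table: only one letter identified so far.
    partial_table = {"a": "α"}
    for char, count in unmapped_characters("abba c", partial_table):
        print(repr(char), count)
```

Handling the most frequent characters first means that most of the text becomes readable after only a few table entries.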


Regards,
---
Diego Queiroz


On Mon, Dec 28, 2009 at 12:31 PM, Manoj Rajagopalan rma...@umich.edu wrote:


> Maybe Adobe performs some kind of OCR when you select a piece of text from
> a PDF document and tries to copy what it can infer onto the clipboard. In
> that case it might just not be as good with Greek as it is with English.
> When I tried to select a line, the selection highlight spilled over to
> neighbouring lines, so Acrobat Reader wasn't very good at identifying line
> boundaries to start with. I am using Acrobat 9.0 on Kubuntu 8.04.
>
> Have you tried googling for phrases like "acrobat copy
> greek/international ..."?
>
> cheers!
> Manoj
>
>
> On Monday 28 December 2009 08:44:20 am Piero Faustini wrote:
> > Hello,
> >
> > I know this is a bit off-topic (well, COMPLETELY off-topic), but I am
> > asking on this mailing list because I know there are people here with
> > knowledge of international language encodings who could at least
> > redirect me to another community for this kind of problem.
> >
> > I need to copy some text (OK, I confess I would like to use Google
> > Translate to find out what it is about!) from a PDF document at this
> > location: http://www.ionio.gr/~GreekMus/articles/samaras.pdf
> >
> > Anyway, I tried to copy the text into several editors and word
> > processors but always failed: I got only bad characters, and only a few
> > words written in the Latin alphabet survived.
> >
> > Maybe a Greek user can copy the text and paste it again in a suitable
> > format. Any other ideas/help?
> >
> > Thanks,
> > Piero




Re: OT: copy a modern greek-encoded text

2009-12-28 Thread Piero

>
> If the translation of this document is very important, I think you have to
> transliterate it by yourself or using an OCR tool (of course, this tool has
> to be able to distinguish Greek characters, so it won't be an easy task).

Thanks, I will try to contact the author himself and ask him for a translation 
and/or the original file.

Thank you very much!!!



