Hi,

I am trying to read a Word document using Win32::OLE. I am able to open the document and read its paragraphs successfully if the contents of the document are in English. But I have a document containing mixed English and Japanese content, and I am getting ‘?’ in place of the Japanese characters. Can anybody suggest how to get the text without the ‘?’ symbols?

    use strict;
    use Win32::OLE;
    use Win32::OLE::Const 'Microsoft Word';
    use Win32::Clipboard;

    my $word_file;
    my $Word;
    my $document;
    my $paragraphs;
    my $paragraph;
    my $enumerate;
    my $text;

    $word_file = 'test.doc';
    $Word = Win32::OLE->new('Word.Application', 'Quit');
    $Word->{'Visible'} = 1;
    $document = $Word->Documents->Open($word_file) || die("Unable to open document");

    # 1041 is the language ID for Japanese
    $Word->{Language} = 1041;
    $Word->{WdOpenFormat} = 5;
    $Word->{WdSaveFormat} = 7;

    $paragraphs = $document->Paragraphs();
    $enumerate = new Win32::OLE::Enum($paragraphs);
    while (defined($paragraph = $enumerate->Next())) {
        $paragraph->{Range}->{LanguageID} = 1041;
        $paragraph->{Range}->{LanguageIDFarEast} = 1041;
        $text = $paragraph->{Range}->{Text};
        print "$text\n";
    }

    $Word->ActiveDocument->Close;
    $Word->Quit;

The test.doc file contains the following line:

    こんにちは、皆

which means "Hello Everybody".

Thanks in advance,
Lalith
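One direction that might be worth trying is sketched here. It assumes the ‘?’ characters appear because Win32::OLE converts Word's Unicode strings to the default ANSI code page, which cannot represent Japanese on a non-Japanese system. The sketch switches the conversion to UTF-8 through the documented `CP` option and writes the paragraphs to a UTF-8 file rather than to the console. The file paths are hypothetical placeholders, and the decode step is a guess that covers Win32::OLE builds returning raw UTF-8 bytes instead of character strings.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Win32::OLE qw(CP_UTF8);
use Win32::OLE::Enum;

# Ask Win32::OLE to translate Word's Unicode strings to UTF-8 instead of
# the default ANSI code page (assumed here to be the source of the '?').
Win32::OLE->Option(CP => CP_UTF8);

my $word_file = 'C:\\temp\\test.doc';   # hypothetical absolute path
my $out_file  = 'paragraphs.txt';       # hypothetical output file

# 'Quit' is the destructor method, called when $word goes out of scope.
my $word = Win32::OLE->new('Word.Application', 'Quit')
    or die "Unable to start Word: " . Win32::OLE->LastError();

my $document = $word->Documents->Open($word_file)
    or die "Unable to open document: " . Win32::OLE->LastError();

# Write UTF-8 to a file; a Windows console may still show '?' even
# when the underlying text is correct.
open my $out, '>:encoding(UTF-8)', $out_file
    or die "Cannot write $out_file: $!";

my $enum = Win32::OLE::Enum->new($document->Paragraphs);
while (defined(my $paragraph = $enum->Next)) {
    my $text = $paragraph->Range->Text;
    next unless defined $text;

    # Assumption: some Win32::OLE builds hand back UTF-8 bytes rather
    # than a character string, so decode only in that case.
    $text = decode('UTF-8', $text) unless utf8::is_utf8($text);

    print {$out} "$text\n";
}

close $out;
$document->Close(0);   # 0 = wdDoNotSaveChanges
```

Opening the resulting file in an editor that understands UTF-8 should show whether the Japanese text survived the round trip. If it did, any remaining ‘?’ in the original script would come from the console output rather than from Word.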
_______________________________________________ Perl-Win32-Admin mailing list [EMAIL PROTECTED] To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
