Thanks for the responses. The URL is behind a login screen, so there is
no way for me to share it directly. I am pretty sure that the problem
is with the page encoding, however, as you've both suggested.
Firefox's View > Character Encoding menu just gives other kinds of
garbled text, but I think I have figured out why. When I view source,
the garbled Cyrillic is really all encoded entities like this (mixed with
Latin accented characters): ńęîă in a page that is
rendering as charset=iso-8859-1.
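
One way to test that guess outside BBEdit is a short Python sketch like
the one below. It rests on assumptions, not on anything the page
confirms: that the entities decode to Central European (Windows-1250)
letters, and that the original bytes were really Windows-1251 Cyrillic.
The function name and the two default code pages are hypothetical
choices for illustration; other pairs (KOI8-R, ISO-8859-5) may be worth
trying the same way.

```python
import html


def recover_cyrillic(garbled, seen_as="cp1250", really="cp1251"):
    """Undo one wrong decoding step.

    Assumption: the bytes were Cyrillic in `really` but got decoded
    as `seen_as` before being written out (possibly as entities).
    """
    # Turn any HTML entities (e.g. &#324;) back into characters.
    # Plain text without entities passes through unchanged.
    text = html.unescape(garbled)
    # Recover the original bytes, then decode them as Cyrillic.
    return text.encode(seen_as).decode(really)


print(recover_cyrillic("ńęîă"))  # prints: ског
```

If the result reads as real Russian, the code page pair is probably
right; a UnicodeEncodeError or fresh garbage suggests a different pair.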

So maybe that is a starting point for me, but I'm guessing this isn't
really a BBEdit topic anymore, so I'll just proceed from here unless
there are any further suggestions.

Thanks.
--
Lloyd Dunn
http://nula.cc/
http://blog.nula.cc/



On Mar 10, 9:16 pm, "Robert A. Rosenberg" <[email protected]> wrote:
> At 10:00 AM +0100 on 03/10/2011, Lloyd Dunn wrote about transliterate
> into cyrillic:
>
> >Below are a few examples of garbled Cyrillic from a web page (this
> >happens to be a CD track list).
>
> >Is there a simple direct way to transliterate or re-encode these into
> >proper Cyrillic characters using BBEdit? I've tried all the charsets
> >in the 'Reopen using encoding' submenu, but to no avail.
>
> >I've done this (usually imperfectly) in the past using online
> >converters and hacky freeware, but I'd really like to accomplish this
> >task within BBEdit.
>
> >Any insights welcome.
>
> >001. �橢���펑 �뎩���� (������) - 
> >����� � �����t� ������
> >002. � ��矴�a�� Ď����� �-��� - 
> >���� �t����� ��ᩢ�ގ�� � A�����
> >003. � ��矴�a�� Ď����� �-��� - 
> >���� a��a��� ��� ���玴���� 
> >��������
> >004. � ��矴�a�� Ď����� �-��� - 
> >ˎ玩��� ����� ������
> >005. � ��矴�a�� Ď����� �-��� - 
> >���� ��� ������� ������
>
> >--
> >Lloyd Dunn
> >http://nula.cc/
> >http://blog.nula.cc/
>
> >--
> >You received this message because you are subscribed to the
> >"BBEdit Talk" discussion group on Google Groups.
> >To post to this group, send email to [email protected]
> >To unsubscribe from this group, send email to
> >[email protected]
> >For more options, visit this group at
> ><http://groups.google.com/group/bbedit?hl=en>
> >If you have a feature request or would like to report a problem,
> >please email "[email protected]" rather than posting to the group.
> >Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>
>
> This looks like the page is declared as ISO-8859-1 in the meta tag
> instead of utf-8. Use the source/page view option to check. Try
> telling your browser to display it as Character Set UTF-8. What is
> the URL? I can look at it for you if you want. When you see the
> characters in groups of 3 (and they are all accented) that is a tip
> off for utf-8. If you look up the ISO-8859-1 codepoint of the
> characters in (for example) � you can see how it converts to the
> UTF-8 coding and see if it is the Cyrillic Unicode range. The only
> problem with this is that Cyrillic takes 2 bytes per character in
> UTF-8, not 3.
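
The check described in the quoted message can be sketched in Python.
This is only an illustration under assumptions: the sample word and the
helper names are hypothetical, and it covers only the case where UTF-8
bytes were misread as ISO-8859-1. It shows that each Cyrillic letter
takes 2 bytes in UTF-8, so a misread page would show pairs of accented
characters rather than groups of 3.

```python
def utf8_byte_lengths(text):
    """Map each distinct character to its UTF-8 byte count."""
    return {ch: len(ch.encode("utf-8")) for ch in text}


def utf8_seen_as_latin1(text):
    """Simulate a UTF-8 page being displayed as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_latin1_misread(garbled):
    """Reverse the misreading: recover the bytes, decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


sample = "Кириллица"  # hypothetical sample word, not from the page
garbled = utf8_seen_as_latin1(sample)

print(utf8_byte_lengths(sample))               # every value is 2
print(len(sample), len(garbled))               # prints: 9 18
print(undo_latin1_misread(garbled) == sample)  # prints: True
```

If undo_latin1_misread raises UnicodeDecodeError on real text from the
page, the page is probably not UTF-8 misread as ISO-8859-1, and a
legacy Cyrillic code page is the more likely culprit.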
