Re: What about use xml for descriptions of packages?

2008-05-25 Thread Ron Johnson
On 05/25/08 13:03, Chris Bannister wrote:
> On Sun, May 25, 2008 at 08:29:56AM -0500, Ron Johnson wrote:
>> What's an extra few MB plus parsing overhead when "everyone" has
>> 250GB HDDs, multi-core 64-bit CPUs and 2+GB RAM?
>>
> 
> Huh? Why commit "good" machines to the landfill?

Because... Newer Is Better, Older is Eviler.

-- 
Ron Johnson, Jr.
Jefferson LA  USA

ESPN makes baseball players better.


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: What about use xml for descriptions of packages?

2008-05-25 Thread Chris Bannister
On Sun, May 25, 2008 at 08:29:56AM -0500, Ron Johnson wrote:
> What's an extra few MB plus parsing overhead when "everyone" has
> 250GB HDDs, multi-core 64-bit CPUs and 2+GB RAM?
> 

Huh? Why commit "good" machines to the landfill?

-- 
Chris.
==
"One, with God, is always a majority, but many a martyr has been burned
   at the stake while the votes were being counted."  -- Thomas B. Reed





Re: What about use xml for descriptions of packages?

2008-05-25 Thread Manuel Prinz
First of all, I did not realize in my first reply that you were speaking from a
translator's point of view. I have only a very limited view of
translation work, so my arguments may not be correct.

Am Sonntag, den 25.05.2008, 16:05 +0200 schrieb Fernando Cerezal:
> How can a program know if
> 
>  * A description sentence with
> two lines
> 
> and
> 
>  * A description sentence with
>    two lines
> 
> are the same?

As always: by definition. If you define that indentation matters, the first
one would be treated as two lines and the second as one. If you define
that indentation does not matter and bullet points have to be separated
by an empty line (I think POD does something like this), then the first
entry is one line. So what is missing is a definition of how a parser
should treat the formatting, not a new document format.
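
To make the "by definition" point concrete, here is a minimal Python sketch
of one possible rule: indentation does not matter, and an empty line
separates items. The rule itself and the name normalize_items are
assumptions for illustration only, not an existing Debian convention or tool.

```python
import re


def normalize_items(text):
    """Split text into items on empty lines and collapse whitespace.

    Assumed rule (hypothetical, for illustration): indentation is
    insignificant and an empty line separates items.
    """
    items = []
    for block in re.split(r"\n\s*\n", text.strip()):
        # Collapse every run of spaces and newlines into a single space.
        items.append(" ".join(block.split()))
    return items


# The same list item, once without and once with an indented second line.
first = " * A description sentence with\ntwo lines"
second = " * A description sentence with\n   two lines"

# Under this rule both spellings normalize to the same single item.
print(normalize_items(first) == normalize_items(second))
```

With a rule like this written down, a translation tool could compare the
normalized text and skip a review when only the whitespace has changed.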

> Or another similar problem: the URL of the project web page changes, like this:
> 
> Upstream URL: http://www.example.com
> Upstream URL: http://www.example.com/project
> 
> This is handled like a modification of the translation, and a
> translator for every language needs to review that, and «translate»
> it.

For this case, the Homepage field exists. By using that, the Description
does not need to be touched in any way.

> The problem is that the number of translated descriptions grows more slowly
> than the number of packages, and besides this, the translators have to
> review older descriptions that contain no new information, only changes of
> format, changes of URLs and so on.

I see the burden that this places on translators, but I do not understand
how XML would resolve it. URLs should IMHO not be in the description, as
they are subject to change and a better alternative exists. If the
format changes, the problem is to detect whether it is a "real" change.
Changes in whitespace are equally easy to detect in the old format
and in an XML-based one. If new points are added to a list or a point is
split into two, you cannot automatically detect and correct those in
either format.

> I think that by using XML, or HTML, and embedding a tiny web browser into
> synaptic, we can resolve part of the problem for translations and use
> the benefits of HTML, like including images, real links for web pages,
> links between packages that can be "easily" handled.

I'm still not convinced that these features have a benefit.

> I wrote XML because it is something I know. I mean markup tags.

This is valid, but I think there are better markup languages for that
purpose that do not affect the readability of the document. Also, a lot
of parsers for these languages exist and they are easily transformable
into other formats, such as (X)HTML.

> The descriptions have a lot of lists, execute strings, URLs, and so on.
> If an item of a list changes, we have to review the whole list, find
> the change, apply it in the translations and send it to the revision process.

How does XML prevent a review here?

> There is no convention with this [Homepage field]. Or if there is, it is not 
> used.

There is. If it's not used, you can file wishlist bugs.

> > I still do not see why the current scheme is limited. Can you give
> > examples where special markup or links in the texts may be useful?
> 
> lists, links, non-translatable strings (URLs, version numbers),
> changes in tabulation...

All such things are covered by more readable markup languages, too. I
agree that lists are a problem with the current format.

> Perhaps my question is badly presented and introduces noise to the
> list. I'm sorry if so.

I do not think so. As said, at least I did not see that you spoke from a
translator's point of view at first.

Best regards
Manuel





Re: What about use xml for descriptions of packages?

2008-05-25 Thread Ron Johnson
On 05/25/08 08:34, David Paleino wrote:
> On Sun, 25 May 2008 08:29:56 -0500, Ron Johnson wrote:
> 
>> What's an extra few MB plus parsing overhead when "everyone" has
>> 250GB HDDs, multi-core 64-bit CPUs and 2+GB RAM?
> 
> Well, and what about !i386, !amd64 and !powerpc ? ;)

Suffer like the 3rd-class citizens they are!

-- 
Ron Johnson, Jr.
Jefferson LA  USA

ESPN makes baseball players better.





Re: What about use xml for descriptions of packages?

2008-05-25 Thread Fernando Cerezal
2008/5/25 Manuel Prinz <[EMAIL PROTECTED]>:
> Am Sonntag, den 25.05.2008, 14:40 +0200 schrieb Fernando Cerezal:
>> I'm thinking about the advantages and disadvantages of writing the
>> description of the packages using XML.
>
> I like XML but it's a huge pain to write by hand. The current format is
> easy to read, easy to write and easy to parse. This is important and
> definitely not the case if it is XML.

How can a program know if

 * A description sentence with
two lines

and

 * A description sentence with
   two lines

are the same?

Or another similar problem: the URL of the project web page changes, like this:

Upstream URL: http://www.example.com
Upstream URL: http://www.example.com/project

This is handled like a modification of the translation, and a
translator for every language needs to review that, and «translate»
it.

Sorry, perhaps XML is not the solution, but I prefer to present a problem
together with a possible solution rather than simply the problem.
The problem is that the number of translated descriptions grows more slowly
than the number of packages, and besides this, the translators have to
review older descriptions that contain no new information, only changes of
format, changes of URLs and so on.

I think that by using XML, or HTML, and embedding a tiny web browser into
synaptic, we can resolve part of the problem for translations and use
the benefits of HTML, like including images, real links for web pages,
links between packages that can be "easily" handled.

>
>> I think using XML the descriptions can be rendered in different forms
>> for text and graphical tools.
>
> If this is something that is really needed, one could think about saner options,
> such as allowing Markdown[1] for the Description field.

I wrote XML because it is something I know. I mean markup tags.

>  I thought about that
> for a while now but never really saw the need to have formatted text or
> code examples in a package description.

The descriptions have a lot of lists, execute strings, URLs, and so on.
If an item of a list changes, we have to review the whole list, find
the change, apply it in the translations and send it to the revision process.

>  IMHO those just do not belong
> there. And for cases where links are needed, special fields such as the
> Homepage field exist, which are properly shown in most tools.

There is no convention with this. Or if there is, it is not used.

>
>> As a disadvantage, we will need to develop a robot that reformats
>> the current descriptions and translations, and we will have to review
>> all of them. Certainly, this will be a huge job.
>
> I'd rather convince people to adopt a new format, not to decide for them
> whether they want it or not. Change by evolution, not revolution.

I agree; I wrote to the list to ask whether there had been any previous discussion.

>
>> However, I think the current scheme of descriptions is very limited
>> and it is better to do it sooner than to translate more descriptions and
>> then move them all to a new format in the future.
>
> I still do not see why the current scheme is limited. Can you give
> examples where special markup or links in the texts may be useful?

lists, links, non-translatable strings (URLs, version numbers),
changes in tabulation...
And all the other semantic issues, and places where form and content are mixed.

I don't know how the dpkg backend works, but I know how translators
work, and I think this could make the process more
efficient.
Perhaps my question is badly presented and introduces noise to the
list. I'm sorry if so.

>
> Best regards

Regards.

> Manuel
>
> [1] http://en.wikipedia.org/wiki/Markdown


Re: What about use xml for descriptions of packages?

2008-05-25 Thread Noah Slater
On Sun, May 25, 2008 at 08:46:22AM -0400, Roberto C. Sánchez wrote:
> On Sun, May 25, 2008 at 02:40:07PM +0200, Fernando Cerezal wrote:
> > I'm thinking about the advantages and disadvantages of writing the
> > description of the packages using XML.
>
> Personally, I would hate this.  I've written too many ant build.xml
> scripts to think that writing XML by hand is even a remotely sane thing
> to do.

Same here.

  A lot of people when faced with a problem think "I know, I'll use XML!"

  Now they have two problems.

Seriously though, XML is often used for no apparent reason other than it being
"trendy" or "cool" or whatnot. I think this is one of those times.

Some initial problems I can think of:

  * You can't "just" use XML, you have to use a dialect. Dialects require
schemas and schemas are Hard.

  * XML is hard to edit and prone to errors when done by hand.

  * XML would be very hard to format by hand when embedded within RFC 2822, the
format of the debian/control file.

  * XML is great for complex content that requires many degrees of freedom and
processing possibilities, none of which really apply to package descriptions.

  * XML, even when used, is usually better when derived from some other format,
such as a light text-based markup language. Think AsciiDoc, Markdown or
reST.

Some initial questions I can think of:

  * What would XML buy us that plain text doesn't?

  * Do those benefits outweigh all the negative issues?

  * Could something more lightweight be chosen instead?

Best,

-- 
Noah Slater - Bytesexual 





Re: What about use xml for descriptions of packages?

2008-05-25 Thread Neil Williams
On Sun, 2008-05-25 at 08:29 -0500, Ron Johnson wrote:
> > 13 x 10 x 20,000 = bloat.
> 
> It would probably be more like one paragraph per .

Still far too much.

> > Now that really is out of the question - please remember that the
> > package descriptions go into the dpkg database which is already too
> > big.
> 
> What's an extra few MB plus parsing overhead when "everyone" has
> 250GB HDDs, multi-core 64-bit CPUs and 2+GB RAM?

I am going to assume you are not being serious.

Try 64MB Flash, ARM5 CPU and 64MB RAM.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/






Re: What about use xml for descriptions of packages?

2008-05-25 Thread David Paleino
On Sun, 25 May 2008 08:29:56 -0500, Ron Johnson wrote:

> What's an extra few MB plus parsing overhead when "everyone" has
> 250GB HDDs, multi-core 64-bit CPUs and 2+GB RAM?

Well, and what about !i386, !amd64 and !powerpc ? ;)

-- 
 . ''`.  Debian maintainer | http://wiki.debian.org/DavidPaleino
 : :'  : Linuxer #334216 --|-- http://www.hanskalabs.net/
 `. `'`  GPG: 1392B174 | http://snipr.com/qa_page
   `-   2BAB C625 4E66 E7B8 450A C3E1 E6AA 9017 1392 B174




Re: What about use xml for descriptions of packages?

2008-05-25 Thread Ron Johnson
On 05/25/08 08:17, Neil Williams wrote:
> On Sun, 2008-05-25 at 15:07 +0200, Fernando Cerezal wrote:
[snip]
> 
>> A description with lines
> 
> Is an extra 13 characters per line, per description, per package.
> 
> 13 x 10 x 20,000 = bloat.

It would probably be more like one paragraph per .

>> The program foo is used to help you with your task <img src="http://logo_of_foo.jpg">
> 
> Now that really is out of the question - please remember that the
> package descriptions go into the dpkg database which is already too
> big.

What's an extra few MB plus parsing overhead when "everyone" has
250GB HDDs, multi-core 64-bit CPUs and 2+GB RAM?

-- 
Ron Johnson, Jr.
Jefferson LA  USA

ESPN makes baseball players better.





Re: What about use xml for descriptions of packages?

2008-05-25 Thread Manuel Prinz
Am Sonntag, den 25.05.2008, 14:40 +0200 schrieb Fernando Cerezal:
> I'm thinking about the advantages and disadvantages of writing the
> description of the packages using XML.

I like XML but it's a huge pain to write by hand. The current format is
easy to read, easy to write and easy to parse. This is important and
definitely not the case if it is XML.

> I think using XML the descriptions can be rendered in different forms
> for text and graphical tools.

If this is something that is really needed, one could think about saner options,
such as allowing Markdown[1] for the Description field. I have thought about that
for a while now but never really saw the need to have formatted text or
code examples in a package description. IMHO those just do not belong
there. And for cases where links are needed, special fields such as the
Homepage field exist, which are properly shown in most tools.

> As a disadvantage, we will need to develop a robot that reformats
> the current descriptions and translations, and we will have to review
> all of them. Certainly, this will be a huge job.

I'd rather convince people to adopt a new format, not to decide for them
whether they want it or not. Change by evolution, not revolution.

> However, I think the current scheme of descriptions is very limited
> and it is better to do it sooner than to translate more descriptions and
> then move them all to a new format in the future.

I still do not see why the current scheme is limited. Can you give
examples where special markup or links in the texts may be useful?

Best regards
Manuel

[1] http://en.wikipedia.org/wiki/Markdown






Re: What about use xml for descriptions of packages?

2008-05-25 Thread Neil Williams
On Sun, 2008-05-25 at 15:07 +0200, Fernando Cerezal wrote:
> 2008/5/25 Roberto C. Sánchez <[EMAIL PROTECTED]>:
> > On Sun, May 25, 2008 at 02:40:07PM +0200, Fernando Cerezal wrote:
> Yes, you are right. However, currently the translations of the Debian
> website are being done by hand, so there is the same problem and it
> works well enough.

What are we talking about here, www.debian.org or apt-cache show?

These are two very different issues.

> are interpreted as two different descriptions.

They would be in gettext - unless marked up in such a way as to not
include the extra spaces.

> A description with lines

Is an extra 13 characters per line, per description, per package.

13 x 10 x 20,000 = bloat.
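
Spelled out as a rough sketch, the estimate looks like this. The reading
of the three factors (extra characters per line, lines per description,
number of packages) and the variable names are assumptions, as is one
byte per character.

```python
# Figures from the estimate above; what each factor stands for is assumed.
extra_chars_per_line = 13
lines_per_description = 10
packages = 20_000

extra_bytes = extra_chars_per_line * lines_per_description * packages
print(extra_bytes)                    # total extra characters
print(round(extra_bytes / 2**20, 2))  # the same in MiB, at 1 byte per character
```

That is about 2.6 million extra characters, or roughly 2.5 MiB, before any
parsing overhead is counted.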

> The program foo is used to help you with your task <img src="http://logo_of_foo.jpg">

Now that really is out of the question - please remember that the
package descriptions go into the dpkg database which is already too
big.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/






Re: What about use xml for descriptions of packages?

2008-05-25 Thread Neil Williams
On Sun, 2008-05-25 at 08:46 -0400, Roberto C. Sánchez wrote:
> On Sun, May 25, 2008 at 02:40:07PM +0200, Fernando Cerezal wrote:
> > Hello,
> > I'm thinking about the advantages and disadvantages of writing the
> > description of the packages using XML.
> 
> Personally, I would hate this.  I've written too many ant build.xml
> scripts to think that writing XML by hand is even a remotely sane thing
> to do.

I agree - I write a lot of XML and XSL and write various XML backends
for various upstream projects. It sounds like a very bad idea to me,
especially considering the extra workload required to parse the dpkg
data on low resource installations.

The extra bloat of the XML tags alone is sufficient reason not to do
this, IMHO.

I like XML but it isn't the right choice for package descriptions IMHO.
Better to work with a simpler format but one that can still be parsed by
gettext so that TDebs can read the translation strings from binary .mo
files.

-- 


Neil Williams
=
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/






Re: What about use xml for descriptions of packages?

2008-05-25 Thread Fernando Cerezal
2008/5/25 Roberto C. Sánchez <[EMAIL PROTECTED]>:
> On Sun, May 25, 2008 at 02:40:07PM +0200, Fernando Cerezal wrote:
>> Hello,
>> I'm thinking about the advantages and disadvantages of writing the
>> description of the packages using XML.
>
> Personally, I would hate this.  I've written too many ant build.xml
> scripts to think that writing XML by hand is even a remotely sane thing
> to do.

Yes, you are right. However, currently the translations of the Debian
website are being done by hand, so there is the same problem and it
works well enough. Besides, the descriptions are formatted using
spaces, so

 * A description sentence with
two lines

and

 * A description sentence with
   two lines

are interpreted as two different descriptions. And a change such as
this means that at least 28 translators (one for each language Debian is
being translated into) have to review the translations.

However, something like

A description with lines

will not have those problems and will reduce the number of translations
that have to be reviewed without any real modification. I think this method
will help to reuse strings and reduce the number of strings to translate.

And using this, we could do things like:

The program foo is used to help you with your task <img src="http://logo_of_foo.jpg">

So, apt can ignore img tags, but programs such as synaptic can offer a
much more graphical experience.

I think the current method for descriptions is very limited.

>
> Regards,
>
> -Roberto
>
> --
> Roberto C. Sánchez
> http://people.connexer.com/~roberto
> http://www.connexer.com





Re: What about use xml for descriptions of packages?

2008-05-25 Thread Mikhail Gusarov
Twas brillig at 14:40:07 25.05.2008 UTC+02 when Fernando Cerezal did gyre and 
gimble:

 FC> I think using XML the descriptions can be rendered in different
 FC> forms for text and graphical tools.

Same for current format. Just use perl/python/whatever instead of XSLT.

 FC> The URLs in the descriptions can be real links and, if the
 FC> project thinks it is appropriate, even links to the logos, and perhaps
 FC> sponsors, of the project when the description is shown in graphical
 FC> mode.

Same for current format.

 FC> Besides, I think it will be easier to develop a tool that handles
 FC> XML for managing descriptions and their translations than plain text
 FC> with its special details.

IMHO quite the opposite.

 FC> And, of course, we will not format the descriptions using spaces.

As long as the formatting is consistent, it may be converted to anything you
want.
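
As a rough illustration of converting the current format, here is a small
Python sketch that renders an extended description as HTML. It assumes the
usual control-file conventions (each line starts with a space, a line
holding only " ." separates paragraphs, lines starting with two or more
spaces are verbatim). The function name description_to_html and the sample
text are made up for this example.

```python
import html


def description_to_html(extended):
    """Render a control-file extended description as simple HTML.

    Hypothetical helper: paragraphs become <p>, verbatim lines <pre>.
    """
    out, para = [], []

    def flush():
        # Emit the paragraph collected so far, if any.
        if para:
            out.append("<p>" + html.escape(" ".join(para)) + "</p>")
            para.clear()

    for line in extended.splitlines():
        if not line.strip():
            continue  # ignore stray empty lines
        if line.strip() == ".":
            flush()  # " ." marks a paragraph break
        elif line.startswith("  "):
            flush()
            # Two or more leading spaces: keep the line verbatim.
            out.append("<pre>" + html.escape(line[1:]) + "</pre>")
        else:
            para.append(line.strip())
    flush()
    return "\n".join(out)


sample = " Foo helps you with your task.\n It is small.\n .\n  * first item"
print(description_to_html(sample))
```

A graphical front end could show the HTML while a text tool keeps printing
the plain lines, without changing what maintainers write.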

-- 




Re: What about use xml for descriptions of packages?

2008-05-25 Thread Roberto C . Sánchez
On Sun, May 25, 2008 at 02:40:07PM +0200, Fernando Cerezal wrote:
> Hello,
> I'm thinking about the advantages and disadvantages of writing the
> description of the packages using XML.

Personally, I would hate this.  I've written too many ant build.xml
scripts to think that writing XML by hand is even a remotely sane thing
to do.

Regards,

-Roberto

-- 
Roberto C. Sánchez
http://people.connexer.com/~roberto
http://www.connexer.com




What about use xml for descriptions of packages?

2008-05-25 Thread Fernando Cerezal
Hello,
I'm thinking about the advantages and disadvantages of writing the
description of the packages using XML.
I think using XML the descriptions can be rendered in different forms
for text and graphical tools. The URLs in the descriptions can be real
links and, if the project thinks it is appropriate, even links to the
logos, and perhaps sponsors, of the project when the description is shown
in graphical mode.
Besides, I think it will be easier to develop a tool that handles XML
for managing descriptions and their translations than plain text with
its special details. And, of course, we will not format the
descriptions using spaces.

As a disadvantage, we will need to develop a robot that reformats
the current descriptions and translations, and we will have to review
all of them. Certainly, this will be a huge job. However, I think
the current scheme of descriptions is very limited and it is better to
do it sooner than to translate more descriptions and then move them all
to a new format in the future.

Has there been any earlier discussion about this?

I sent this to i18n, but I got no response.

Regards.

