[Unicon-group] Fwd: Re: Ruby Python vs. Icon/Unicon

Kent Palmer Fri, 09 Mar 2007 21:54:38 -0800

Barry---

You make a good point; I completely skipped Icon. I used Snobol for years, and then one day I wondered whether there had been any updates to the language and ran into Icon and Unicon, but by that time Unicon already existed.

Basically I use the Icon subset of Unicon if the program is very large, but mostly I just use the string parsing in a single straight program without any function calls.

I never saw the Icon book. I think the online PDF books are great; normally I find what I need in one of them.


But it is true that if I had had to buy a book I would probably have stuck with Snobol.

I kind of like Snobol because it reminds me of assembly language programming.

I tell people about Snobol/Icon/Unicon all the time, because it is such a great tool, but no one is ever interested.

And I don't understand why, because if you need to manipulate text it is the only way to go.

I have been working on a text-related project recently and tried to do some searches in VBA for Word.

That is really despicable. I just could not get at what I needed, because you can grab only characters, words or paragraphs, but no LINES, because the lines are in a flow. Then you have to manipulate either selections or ranges. And once you have the ranges, you have only very primitive string manipulation functions to deal with them.

I asked around, and no one I know does any VBA programming in Word; everyone uses VBA with Excel.

When I mentioned Snobol/Icon/Unicon to the people I was talking to about my problem, I got only blank stares until I got to Snobol, and then everyone said that they thought it was just a relic of computing history, and they were shocked that there were follow-on languages in that family.

I had to solve my problem outside of Word. I did not in this case solve it with Unicon, but with a concordance program. I have a wonderful concordance program from http://www.concordancesoftware.co.uk/ which did the trick, because the text was so malformed that it would have taken forever to write a program that would recognize all the cases.

But I still had to do a lot of hand manipulation in order to create the files I needed from the concordance program output.

It seems to me no one has really solved the problem of text manipulation on the fly.

I have to have a problem of a certain size before it is worth writing a program in Unicon to solve it. For all the text that is not worth writing a Unicon program for, I am constantly using multiple tools to process the data and massage it into the needed format.

So it seems to me that I must not be the only one doing this. I think there must be a huge hidden market out there for something that manipulates text which is more powerful than regular expressions and search, but less of a problem than programming in Unicon. I have seen several suites of special-purpose tools (for example http://www.boxersoftware.com/textmonkey.htm ), but I have never seen anything that fills the gap between what is possible using spreadsheets together with word processing programs and what is possible using Unicon.

Perhaps if Unicon addressed this hidden market somehow, it could find its niche. Instead of program examples we need something like templates by which programs could be altered to do slightly different processing. The problem is that tools are too specific, but to get variation you need the complete generality of the programming language. There should be something between these extremes: adaptable tools that are flexible and changeable but still not general.

This is the idea of Domain Specific Languages, but I don't think there is anything equivalent to that in the text manipulation world.

I was hoping that XML would fill the bill, but it really only makes a bigger problem, because most of the texts do not have tags, and placing the tags in the text is just as big a problem as writing a text recognition program.

For instance, in my latest effort I tried producing the XML output of Word 2007 and then unzipping it to get the core document. But that XML had the text so split up that it was going to be a bigger problem than just dealing with the text in bulk.
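
To make the "split up" problem concrete, here is a minimal sketch in Python rather than Unicon. It assumes the usual layout of a Word 2007 .docx file, a zip archive whose main part is word/document.xml, and it simply glues the scattered text runs of each paragraph back together. The function names and the sample content are made up for illustration, and real documents (tables, footnotes, tracked changes) would need more care.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Word 2007 (.docx) documents.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Return one string per paragraph, rejoining the runs Word splits up.

    `source` is a path or a file-like object holding a .docx archive.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is scattered over many <w:t> elements.
        pieces = [t.text or "" for t in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return paragraphs


def make_sample_docx():
    """Build a tiny in-memory archive for demonstration (hypothetical content)."""
    body = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>'
        "<w:p><w:r><w:t>Snob</w:t></w:r><w:r><w:t>ol and </w:t></w:r>"
        "<w:r><w:t>Icon</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Unicon</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer
```

Calling docx_paragraphs(make_sample_docx()) rejoins the three runs of the first sample paragraph into a single string. This only recovers paragraphs; it does nothing about the LINES problem, since lines do not exist in the file at all.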

Once you get outside of Word by dumping a txt file, then many times the text is so screwed up that it is difficult to parse.

So if the TXT is screwed up and the things searched for are malformed, then building a text recognizer is complicated. So that is where the concordance program came in handy.

I wish that once I had the concordance program output I could then parse that. But it was not worth the time to write that program, even though the concordance program produces txt and html versions of the concordance.

Anyway, I know I am waffling a bit, but it just seems that there is a middle ground of text manipulation for which the tools are missing.


I heard today that 161 exabytes (an exabyte is 10^18 bytes) of data was produced this last year.

In 2003 it was 5 exabytes.

It seems that with all that data, there should be a market for a good text manipulation language.

Especially one that supported some sort of middle-level tools for problems where writing a program is not practical, but there is a lot of text processing to be done by hand.

Perhaps there is some level between text and XML which we are missing out on, but where there is a significant amount of work performed.

I know that many times I am using multiple tools to get the final result I am seeking, where I am massaging the document many times with different tools and in multiple passes, each pass doing something a little different to it toward achieving the final result.

For instance, one trick I am sure you have all used is to do a search and replace or insert on a text document to put in the hooks that the text recognizer will then search for. Sometimes that is tricky with wildcards. An excellent tool in this regard is SR http://www.funduc.com/search_replace.htm for search and replace across multiple files.
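
The hook trick can be sketched in a few lines. This is only an illustration in Python, not something from any of the tools mentioned: the marker string, the function names, and the sample pattern are all made up. The first pass uses a loose pattern to drop a marker in front of each messy entry; the second pass only has to look for the marker.

```python
import re

# Hypothetical marker; choose one that cannot occur in the text itself.
HOOK = "@@ENTRY@@"


def insert_hooks(text, pattern):
    """Pass 1: put HOOK in front of every match of `pattern`.

    The pattern is applied line by line (re.MULTILINE), so `^` matches
    at the start of each line.
    """
    return re.sub(pattern, lambda m: HOOK + m.group(0), text,
                  flags=re.MULTILINE)


def split_on_hooks(text):
    """Pass 2: the recognizer only needs to find HOOK, not the messy cases."""
    parts = text.split(HOOK)
    # parts[0] is whatever came before the first hook, so it is skipped.
    return [part.strip() for part in parts[1:]]
```

For example, insert_hooks(text, r"^\s*\d+\s*[.)-]") marks lines that start with a number followed by a period, parenthesis or dash, however they are indented, and split_on_hooks then returns one string per marked entry. The point of the two passes is that the pattern can be adjusted, or the hooks fixed by hand, before the second pass runs.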


Kent Palmer

Date: Fri, 9 Mar 2007 05:21:39 -0600
From: [EMAIL PROTECTED]
To: unicon-group@lists.sourceforge.net
Subject: Re: [Unicon-group] Ruby Python vs. Icon/Unicon

bryan rasmussen <[EMAIL PROTECTED]> wrote:
> I'm a newbie with Unicon, I basically decided to start with it
> because, well I like learning new languages especially ones with an
> easily perceived niche. I think the niche of Unicon as you say is text
> processing.
>
> I use XML a lot in my day to day, I'm not sure I understand the
> assertion that Unicon would be great for XML, since the main thing one
> needs for XML programming is easy tree manipulation, such as is
> provided with XSL-T.

There's an XML parser in the uni/xml directory of the CVS. I haven't
used it a lot, though it is likely I will be soon.

I agree with Kent Palmer that the syntax of Icon/Unicon isn't great,
but I'm not sure how much effect that has on popularity, considering
that languages with significant syntax difficulties, such as Perl and
C++, have been adopted anyway. Personally, my best hypothesis would be
that Icon hasn't become more popular mainly because you had to go buy
the book, 'The Icon Programming Language', and thus only the most
curious people ever tried Icon, for they had to go to some effort and
spend money doing it. Perl is an example of a language that became
popular in part because you could try it without buying a book, though
of course you might buy a book or two later. The copylefted Unicon
book as a PDF is a start towards a remedy, but you need more stuff
similar to the Perl manpage(s), the GNU info pages, on-line HTML
tutorials and documentation, and so forth.

--
Barry.SCHWARTZ ĉe chemoelectric punkto org  http://chemoelectric.org
              Free stuff / Senpagaj varoj:  http://crudfactory.com
'Democracies don't war; democracies are peaceful countries.' - Bush
(http://www.whitehouse.gov/news/releases/2005/12/20051219-2.html)

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Unicon-group mailing list
Unicon-group@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/unicon-group

