Re: [Bug-wget] /bin/sh: msgfmt: not found

2012-05-30 Thread illusionoflife
On Wednesday, May 30, 2012 09:51:50 Akram Kh Almatari wrote:
> Dear Sir,
>
> I get the following error after running the make command. What is the
> cause of this error, and is it possible to get help with it? I have also
> attached the output of the configure command.
Do you have *core/gettext* installed? msgfmt is part of gettext.

-- 
Best regards, illusionoflife
Contact me on illusion.of.lif...@gmail.com 
or sip:illsionofl...@ekiga.net
Please, read rfc1855, if did not already.




[Bug-wget] Segfault with WARC + CDX

2012-05-30 Thread Gijs van Tulder

Hi,

There's a bug in the warc_find_duplicate_cdx_record function. If you 
provide a file with CDX records, Wget can segfault if a record is not 
found in the CDX file. In fact, the deduplication now only works if 
*every* new record can be found in the CDX index.


The segmentation fault is generated on these lines in src/warc.c:

  hash_table_get_pair (warc_cdx_dedup_table, sha1_digest_payload, &key,
                       &rec_existing);
  if (rec_existing != NULL && strcmp (rec_existing->url, url) == 0)

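Contrary to what the code expects, hash_table_get_pair does not set 
rec_existing to NULL if no record is found. The pointer is never 
initialized, so on a miss it holds an indeterminate value and the 
dereference in strcmp can crash. Instead of checking for NULL, the 
function should check whether the return value of hash_table_get_pair 
is non-zero: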


  int found = hash_table_get_pair (warc_cdx_dedup_table, sha1_digest_payload,
                                   &key, &rec_existing);
  if (found && strcmp (rec_existing->url, url) == 0)

The attached patch makes this change. With it, deduplication also works 
when some records are missing from the CDX index.
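
For illustration, here is a minimal, self-contained sketch of the same 
pattern outside of Wget. It is not the Wget code: struct record, table, 
lookup and find_duplicate are hypothetical stand-ins, and lookup only 
imitates the contract described above for hash_table_get_pair (return 
non-zero and fill the out pointer on a hit, return 0 and leave the out 
pointer untouched on a miss).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct warc_cdx_record; only the fields
   needed for the illustration are present. */
struct record
{
  const char *digest;
  const char *url;
};

/* Hypothetical two-entry "index" replacing the real hash table. */
static struct record table[] = {
  { "sha1:AAAA", "http://example.com/a" },
  { "sha1:BBBB", "http://example.com/b" },
};

/* Toy lookup imitating the contract described for hash_table_get_pair:
   returns 1 and stores the match in *rec on a hit; returns 0 and leaves
   *rec untouched on a miss. */
static int
lookup (const char *digest, struct record **rec)
{
  size_t i;

  assert (digest != NULL && rec != NULL);
  for (i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (table[i].digest, digest) == 0)
      {
        *rec = &table[i];
        return 1;
      }
  return 0;
}

/* Same shape as the fixed check: the return value decides whether
   rec_existing may be read.  Because && short-circuits, the
   uninitialized pointer is never dereferenced on a miss. */
static struct record *
find_duplicate (const char *url, const char *digest)
{
  struct record *rec_existing;
  int found = lookup (digest, &rec_existing);

  if (found && strcmp (rec_existing->url, url) == 0)
    return rec_existing;
  else
    return NULL;
}
```

In this sketch, find_duplicate ("http://example.com/a", "sha1:AAAA") 
returns the first table entry, while an unknown digest such as 
"sha1:CCCC", or a known digest paired with a different URL, returns NULL 
without ever reading the out pointer.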

Regards,

Gijs
From 807b98d7d9289765c9f210336d2dbf294d663f99 Mon Sep 17 00:00:00 2001
From: Gijs van Tulder <gvtul...@gmail.com>
Date: Wed, 30 May 2012 23:00:04 +0200
Subject: [PATCH] warc: Fix segfault if CDX record is not found.

---
 src/ChangeLog |    4 ++++
 src/warc.c    |    6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/ChangeLog b/src/ChangeLog
index 7e16b17..9e74e47 100644
--- a/src/ChangeLog
+++ b/src/ChangeLog
@@ -1,3 +1,7 @@
+2012-05-30  Gijs van Tulder  <gvtul...@gmail.com>
+
+	* warc.c: Fix segfault if CDX record is not found.
+
 2011-05-26  Steven Schweda  <s...@antinode.info>
 	* connect.c [HAVE_SYS_SOCKET_H]: Include sys/socket.h.
 	[HAVE_SYS_SELECT_H]: Include sys/select.h.
diff --git a/src/warc.c b/src/warc.c
index 24751db..92a49ef 100644
--- a/src/warc.c
+++ b/src/warc.c
@@ -1001,10 +1001,10 @@ warc_find_duplicate_cdx_record (char *url, char *sha1_digest_payload)
 
   char *key;
   struct warc_cdx_record *rec_existing;
-  hash_table_get_pair (warc_cdx_dedup_table, sha1_digest_payload, &key,
-                       &rec_existing);
+  int found = hash_table_get_pair (warc_cdx_dedup_table, sha1_digest_payload,
+                                   &key, &rec_existing);
 
-  if (rec_existing != NULL && strcmp (rec_existing->url, url) == 0)
+  if (found && strcmp (rec_existing->url, url) == 0)
     return rec_existing;
   else
     return NULL;
-- 
1.7.4.1



Re: [Bug-wget] alpha release (1.13.4.56-620c)

2012-05-30 Thread Steven M. Schweda
> It's not really your fault, but the GNU regex code (lib/regcomp.c)
> has an annoying portability problem, [...]

   The gnulib folks have fixed this regcomp.c problem.  I don't much
like the solution they adopted, but it does get past more compilers than
the old code.

  http://lists.gnu.org/archive/html/bug-gnulib/2012-05/msg00276.html



   Steven M. Schweda   sms@antinode-info
   382 South Warwick Street(+1) 651-699-9818
   Saint Paul  MN  55105-2547