[pacman-dev] [PATCH 4/5] check for overflow when setting HTTP_USER_AGENT

2017-05-10 Thread Andrew Gregory
gcc7 issues a warning about a potential overflow if left unchecked.

Signed-off-by: Andrew Gregory 
---
 src/pacman/pacman.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/pacman/pacman.c b/src/pacman/pacman.c
index 605aec3e..11b7e6f0 100644
--- a/src/pacman/pacman.c
+++ b/src/pacman/pacman.c
@@ -271,10 +271,15 @@ static void setuseragent(void)
 {
char agent[101];
struct utsname un;
+   int len;
 
uname();
-   snprintf(agent, 100, "pacman/%s (%s %s) libalpm/%s",
+   len = snprintf(agent, 100, "pacman/%s (%s %s) libalpm/%s",
PACKAGE_VERSION, un.sysname, un.machine, 
alpm_version());
+   if(len >= 100) {
+   pm_printf(ALPM_LOG_WARNING, _("HTTP_USER_AGENT truncated\n"));
+   }
+
setenv("HTTP_USER_AGENT", agent, 0);
 }
 
-- 
2.12.2


[pacman-dev] [PATCH 5/5] remove unused byte from user agent buffer

2017-05-10 Thread Andrew Gregory
snprintf prints at most n bytes including the terminating '\0'.  The
extra reserved byte was never being used.

Signed-off-by: Andrew Gregory 
---
 src/pacman/pacman.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/pacman/pacman.c b/src/pacman/pacman.c
index 11b7e6f0..8c7f715e 100644
--- a/src/pacman/pacman.c
+++ b/src/pacman/pacman.c
@@ -269,7 +269,7 @@ static void localize(void)
  */
 static void setuseragent(void)
 {
-   char agent[101];
+   char agent[100];
struct utsname un;
int len;
 
-- 
2.12.2


[pacman-dev] [PATCH 2/5] query_fileowner: avoid buffer overflow

2017-05-10 Thread Andrew Gregory
Copying a string into a buffer that  has just been determined to not be
able to hold it is obviously incorrect.  The actual error handling
appears to have been unintentionally removed in
47762ab687959e48acc2de8592fcf3ba3cfa502b.

Signed-off-by: Andrew Gregory 
---
 src/pacman/query.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/pacman/query.c b/src/pacman/query.c
index 119764bc..247423fa 100644
--- a/src/pacman/query.c
+++ b/src/pacman/query.c
@@ -205,6 +205,7 @@ static int query_fileowner(alpm_list_t *targets)
size_t rlen = strlen(rpath);
if(rlen + 2 >= PATH_MAX) {
pm_printf(ALPM_LOG_ERROR, _("path too 
long: %s/\n"), rpath);
+   goto targcleanup;
}
strcat(rpath + rlen, "/");
}
-- 
2.12.2


[pacman-dev] [PATCH 3/5] alpm_list: abort on memory allocation failure

2017-05-10 Thread Andrew Gregory
This makes it possible to detect a failure in several alpm_list
functions.  Previously these functions would continue after a failure,
returning partial results and potentially leaking memory.
Unfortunately, NULL is a valid return value for the affected functions
if the input list is empty, so they still do not have a dedicated error
value.  Callers can at least detect an error by checking if the input
list was empty.

Signed-off-by: Andrew Gregory 
---

There are more functions that fail silently, but they can't be fixed without
breaking API.

 lib/libalpm/alpm_list.c | 29 -
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/lib/libalpm/alpm_list.c b/lib/libalpm/alpm_list.c
index 0f1b819c..334f67ba 100644
--- a/lib/libalpm/alpm_list.c
+++ b/lib/libalpm/alpm_list.c
@@ -461,7 +461,10 @@ alpm_list_t SYMEXPORT *alpm_list_remove_dupes(const 
alpm_list_t *list)
alpm_list_t *newlist = NULL;
while(lp) {
if(!alpm_list_find_ptr(newlist, lp->data)) {
-   newlist = alpm_list_add(newlist, lp->data);
+   if(alpm_list_append(, lp->data) == NULL) {
+   alpm_list_free(newlist);
+   return NULL;
+   }
}
lp = lp->next;
}
@@ -480,7 +483,10 @@ alpm_list_t SYMEXPORT *alpm_list_strdup(const alpm_list_t 
*list)
const alpm_list_t *lp = list;
alpm_list_t *newlist = NULL;
while(lp) {
-   newlist = alpm_list_add(newlist, strdup(lp->data));
+   if(alpm_list_append_strdup(, lp->data) == NULL) {
+   FREELIST(newlist);
+   return NULL;
+   }
lp = lp->next;
}
return newlist;
@@ -498,7 +504,10 @@ alpm_list_t SYMEXPORT *alpm_list_copy(const alpm_list_t 
*list)
const alpm_list_t *lp = list;
alpm_list_t *newlist = NULL;
while(lp) {
-   newlist = alpm_list_add(newlist, lp->data);
+   if(alpm_list_append(, lp->data) == NULL) {
+   alpm_list_free(newlist);
+   return NULL;
+   }
lp = lp->next;
}
return newlist;
@@ -523,8 +532,15 @@ alpm_list_t SYMEXPORT *alpm_list_copy_data(const 
alpm_list_t *list,
void *newdata = malloc(size);
if(newdata) {
memcpy(newdata, lp->data, size);
-   newlist = alpm_list_add(newlist, newdata);
+   if(alpm_list_append(, newdata) == NULL) {
+   free(newdata);
+   FREELIST(newlist);
+   return NULL;
+   }
lp = lp->next;
+   } else {
+   FREELIST(newlist);
+   return NULL;
}
}
return newlist;
@@ -552,7 +568,10 @@ alpm_list_t SYMEXPORT *alpm_list_reverse(alpm_list_t *list)
list->prev = NULL;
 
while(lp) {
-   newlist = alpm_list_add(newlist, lp->data);
+   if(alpm_list_append(, lp->data) == NULL) {
+   alpm_list_free(newlist);
+   return NULL;
+   }
lp = lp->prev;
}
list->prev = backup; /* restore tail pointer */
-- 
2.12.2


Re: [pacman-dev] [PATCH] Handle empty string passed to query_owner

2017-05-10 Thread Andrew Gregory
On 05/09/17 at 03:46pm, Allan McRae wrote:
> Passing an empty string to pacman -Qo results in:
> error: No package owns 
> 
> Treat an empty string being passed the same as recieving a NULL value
> and exit searching for an owner.
> 
> Signed-off-by: Allan McRae 
> ---
> 
> Running 'pacman -Qo ""' will now just exit pacman. I'm not sure if we
> need an error/warning message for this case.

I don't think we should ever fail silently.  I would include an error
the same as any other path we can't resolve.

>  src/pacman/query.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/pacman/query.c b/src/pacman/query.c
> index a8417570..119764bc 100644
> --- a/src/pacman/query.c
> +++ b/src/pacman/query.c
> @@ -161,7 +161,8 @@ static int query_fileowner(alpm_list_t *targets)
>   size_t len, is_dir;
>   unsigned int found = 0;
>  
> - if((filename = strdup(t->data)) == NULL) {
> + filename = strdup(t->data);
> + if(filename == NULL || strcmp(filename, "") == 0) {
>   goto targcleanup;
>   }
>  
> -- 
> 2.12.2


Re: [pacman-dev] [PATCH] Do not continuously try to open an invalid database

2017-05-10 Thread Andrew Gregory
On 05/09/17 at 11:38am, Allan McRae wrote:
> If you manage to download a bad database (e.g. an html file when
> behind a proxy or with a badly configured webserver), pacman makes
> sure you know about it.  Here is some example output:
> 
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> error: could not open file /var/lib/pacman/sync/extra.db: Unrecognized 
> archive format
> 
> I don't know how many times that gets printed because it goes beyond my 
> scrollback
> buffer.
> 
> Flag a database that we can "open" and "fstat" but not read from as invalid 
> to avoid
> this.
> 
> Signed-off-by: Allan McRae 
> ---
>  lib/libalpm/be_sync.c | 3 +++
>  lib/libalpm/util.c| 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c
> index 06f509a6..46959298 100644
> --- a/lib/libalpm/be_sync.c
> +++ b/lib/libalpm/be_sync.c
> @@ -486,6 +486,9 @@ static int sync_db_populate(alpm_db_t *db)
>   fd = _alpm_open_archive(db->handle, dbpath, ,
>   , ALPM_ERR_DB_OPEN);
>   if(fd < 0) {
> + if(db->handle->pm_errno == ALPM_ERR_DB_INVALID) {
> + db->status &= DB_STATUS_INVALID;
> + }

ACK

>   return -1;
>   }
>   est_count = estimate_package_count(, archive);
> diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c
> index 35fc7f41..0fb9bb6d 100644
> --- a/lib/libalpm/util.c
> +++ b/lib/libalpm/util.c
> @@ -268,6 +268,7 @@ int _alpm_open_archive(alpm_handle_t *handle, const char 
> *path,
>   if(archive_read_open_fd(*archive, fd, bufsize) != ARCHIVE_OK) {
>   _alpm_log(handle, ALPM_LOG_ERROR, _("could not open file %s: 
> %s\n"),
>   path, archive_error_string(*archive));
> + error = ALPM_ERR_DB_INVALID;

We shouldn't be using a db-specific error here.  _alpm_open_archive is
used for packages as well.  If we want to differentiate between open
and format failures we'll either have to take a second error code or
set return values and let the caller set the handle error based on
that.  I think the ALPM_ERR_*_OPEN codes we're using for this function
now are fine, though.

>   goto error;
>   }
>  
> -- 
> 2.12.2


Re: [pacman-dev] Repository management

2017-05-10 Thread Andrew Gregory
On 05/09/17 at 10:54pm, Allan McRae wrote:
> Hi all,
> 
> Every time I attempt to work on repo-add, I find it to be a very
> difficult endeavour.  Even though it is half the size of makepkg
> (without even including any of libmakepkg), it is much more convoluted
> to work on.
> 
> We also have a weird repository database system.  We have:
> - .db dbs with package information, signatures and delta information
> - .files dbs that are the same as .db dbs but additionally include filelists
> 
> There are two reasons the .files dbs replicate all information in the
> .db dbs
>  - .db and .files dbs getting out of sync could cause issues
>  - a complete database is useful for things like archweb, mostly to
> avoid the above
> 
> I would also like to include information on source packages to these
> databases.  The files information is separate due to wanting our primary
> database to be small.  Likewise, source package information needs to be
> separate (the signatures take most of the size in the .db dbs, so adding
> source package signatures effectively doubles the size).
> 
> So two points up for discussion:
> 
> 
> 1) Sync repository layout?  I don't see any point in leaving the tar
> based format, as reading of sync databases is not a bottleneck.  (The
> local db format can be a bottleneck, but that is a separate discussion...)
> 
> Do we split the information in .db out of .files and add a .full db with
> complete information?  Then any .src db could follow suit and just have
> source package information.  How do we get around the out of sync issue
> (e.g., a package is removed from .db, but we have an old .files database
> with it).  Do we add timestamps, and print a warning on -F operations
> when the two are out of sync?
 
What about just not including the signature in the database?  Make the
inclusion of the signature optional and have pacman (or whatever
downloads the source package) also look for a corresponding .sig file
if it's not in the db.  pacman -U already looks for a .sig file when
downloading a package and you have a feature request to download .sig
files even with -S, so code-wise this seems like a pretty clean
solution. Then you can include the source information right in the
primary DB and Arch's devtools can opt to omit the signature from the
db.
 
> 2) Do we need a better (read "more easily maintainable") tool for
> handling database generation and updates?  libalpm already can read in
> information package files, so we could add libalpm/db_write.c with the
> database creation functions.   Should we unify our repo format with our
> local database format which we already write?

I would love to see us drop the ini-style .PKGINFO format, if that's
what you mean.  Even without adding a database writer to libalpm,
having two formats for the exact same data is unnecessary and leads to
inconsistencies between the two.

apg


Re: [pacman-dev] Repository management

2017-05-10 Thread Allan McRae
On 11/05/17 02:54, Dave Reisner wrote:
> On Tue, May 09, 2017 at 10:54:44PM +1000, Allan McRae wrote:
>> Hi all,
>>
>> Every time I attempt to work on repo-add, I find it to be a very
>> difficult endeavour.  Even though it is half the size of makepkg
>> (without even including any of libmakepkg), it is much more convoluted
>> to work on.
>>
>> We also have a weird repository database system.  We have:
>> - .db dbs with package information, signatures and delta information
>> - .files dbs that are the same as .db dbs but additionally include filelists
>>
>> There are two reasons the .files dbs replicate all information in the
>> .db dbs
>>  - .db and .files dbs getting out of sync could cause issues
>>  - a complete database is useful for things like archweb, mostly to
>> avoid the above
>>
>> I would also like to include information on source packages to these
>> databases.  The files information is separate due to wanting our primary
>> database to be small.  Likewise, source package information needs to be
>> separate (the signatures take most of the size in the .db dbs, so adding
>> source package signatures effectively doubles the size).
>>
>> So two points up for discussion:
>>
>>
>> 1) Sync repository layout?  I don't see any point in leaving the tar
>> based format, as reading of sync databases is not a bottleneck.  (The
>> local db format can be a bottleneck, but that is a separate discussion...)
> 
> Isn't this a historical reversal? IIRC, the sync DBs used to be expanded
> onto disk, and we decided to leave them as tarballs to address
> performance/fragmentation concerns.

To be clear, I was saying to stay tar based and not to move to something
else.

>> Do we split the information in .db out of .files and add a .full db with
>> complete information?  Then any .src db could follow suit and just have
>> source package information.  How do we get around the out of sync issue
>> (e.g., a package is removed from .db, but we have an old .files database
>> with it).  Do we add timestamps, and print a warning on -F operations
>> when the two are out of sync?
>>
>>
>> 2) Do we need a better (read "more easily maintainable") tool for
>> handling database generation and updates?  libalpm already can read in
>> information package files, so we could add libalpm/db_write.c with the
>> database creation functions.   Should we unify our repo format with our
>> local database format which we already write?
>>
> 
> I'd urge you not to make this a part of pacman. It's too far off the
> beaten path for most users to make it a part of an already complicated
> tool.
> 

Definitely not part of pacman.  I was suggesting another program with a
libalpm backend.

>>
>> I am looking for ideas here.  Please brainstorm to your hearts content.
> 
> WRT replacing repo-add, I'd suggest we come up with a the use cases we
> want to support, design an interface to meet them, and then come up with
> the implementation. Might be nice to start with the Arch Linux
> repository layout as an example that we'd want to support (pooled
> packages with symlinks into repo dirs).
> 
>> Cheers,
>> Allan
> .
> 


Re: [pacman-dev] Repository management

2017-05-10 Thread Dave Reisner
On Tue, May 09, 2017 at 10:54:44PM +1000, Allan McRae wrote:
> Hi all,
> 
> Every time I attempt to work on repo-add, I find it to be a very
> difficult endeavour.  Even though it is half the size of makepkg
> (without even including any of libmakepkg), it is much more convoluted
> to work on.
> 
> We also have a weird repository database system.  We have:
> - .db dbs with package information, signatures and delta information
> - .files dbs that are the same as .db dbs but additionally include filelists
> 
> There are two reasons the .files dbs replicate all information in the
> .db dbs
>  - .db and .files dbs getting out of sync could cause issues
>  - a complete database is useful for things like archweb, mostly to
> avoid the above
> 
> I would also like to include information on source packages to these
> databases.  The files information is separate due to wanting our primary
> database to be small.  Likewise, source package information needs to be
> separate (the signatures take most of the size in the .db dbs, so adding
> source package signatures effectively doubles the size).
> 
> So two points up for discussion:
> 
> 
> 1) Sync repository layout?  I don't see any point in leaving the tar
> based format, as reading of sync databases is not a bottleneck.  (The
> local db format can be a bottleneck, but that is a separate discussion...)

Isn't this a historical reversal? IIRC, the sync DBs used to be expanded
onto disk, and we decided to leave them as tarballs to address
performance/fragmentation concerns.

> Do we split the information in .db out of .files and add a .full db with
> complete information?  Then any .src db could follow suit and just have
> source package information.  How do we get around the out of sync issue
> (e.g., a package is removed from .db, but we have an old .files database
> with it).  Do we add timestamps, and print a warning on -F operations
> when the two are out of sync?
> 
> 
> 2) Do we need a better (read "more easily maintainable") tool for
> handling database generation and updates?  libalpm already can read in
> information package files, so we could add libalpm/db_write.c with the
> database creation functions.   Should we unify our repo format with our
> local database format which we already write?
> 

I'd urge you not to make this a part of pacman. It's too far off the
beaten path for most users to make it a part of an already complicated
tool.

> 
> I am looking for ideas here.  Please brainstorm to your hearts content.

WRT replacing repo-add, I'd suggest we come up with a the use cases we
want to support, design an interface to meet them, and then come up with
the implementation. Might be nice to start with the Arch Linux
repository layout as an example that we'd want to support (pooled
packages with symlinks into repo dirs).

> Cheers,
> Allan