Calibre is also a great idea/solution for this project, unless one of the goals of the project is to learn about databases and how to use a database to solve a real world problem. I added the second clause, because just learning about databases an admirable goal, but imo, one also should learn how to build the bits that allow an end user to create/read/update/delete data in the database. Django is one vehicle for doing this. Paul, I have to disagree with your analysis of django...I have worked with many non-programmers on building useful projects with that framework. There is a learning curve, but it is not insurrmountable by a non-programmer. ymmv.
Mark On Mon, Aug 11, 2014 at 3:26 PM, Paul <[email protected]> wrote: > Hi Joel, > > Some comments on your diagram, in the order I thought of them, not of > importance: > > 1) I tend to use integers for primary keys. Smallints only go into the > thousands, and most of the stuff I have worked on has needed to be able > to scale well beyond that. Even things that seem small to start with > have a habit of growing unexpectedly, and you don't want to have to > change your database design down the road for something small like > that. > > Planning ahead really is the name of the game with databases, as > half the reason we use them to begin with is to handle things that > have grown beyond our simple spreadsheets, or that are expected to > become unwieldy in the future. The whole point of a good design is > that it should handle this sort of thing. And the extra storage space > really shouldn't be an issue. For example, a quick estimate > suggests that using integer instead of smallint for keys in the diagram > you gave will only use an extra 1 mb of storage space if you go all the > way up to the limit of the smallint range, and beyond that it is > academic as you would be *required* to use int instead of smallint. > > But for a home project you're unlikely to exceed that number, so this > is fine. It's just worth being aware of this when designing it, so that > you know the choices you have made, and why you have made them. > > I would make them unsigned smallints, though. Not really a good idea to > use negative numbers for keys, and so making them signed just wastes > half their range. > > 2) I wouldn't put a Read flag in the Book table. You have a Status > table with a ReadStatus, so this would be duplication of data. > > 3) Same with the SeriesComplete flag in the Series table. You can get > that information by doing a query that would check if all the books in > the series are in the read status. > > It's usually much better practice to generate this sort of information > as you need it from the relevant pieces of information in the database > (assuming you have all the required pieces in the database), than to > store this information a second time, which is just duplication. Even > if the space wasted isn't important, it just leads to the possibility > of the data being inconsistant. > > The only time one stores such things is when the actual calculation to > work this out from the pieces of data takes so long that it is actually > necessary to store calculated results to save time in the application. > > 4) Same with AuthorID in the Series table. Each book will have an > author, so you don't need a series author. That way if the books in the > series are written by different authors everything still works. This > way, who would the series author be, and how would that be any > different from the authors of the individual books? > > Note that you can still show this information in the frontend when > viewing the series. You can make a query to check if the authors of all > the books in the series are the same, and if so show that author, but > if they are different just show "various" or something. > > Also, when entering all the books in a series, you can still have the > frontend ask you for the series information first, including author, > and if you fill it in then it will prepopulate the author field when > adding each book. > > Of course, maybe you actually do have a use for a series author as > separate to the authors of the individual books, so maybe this does > make sense to you, it just doesn't seem right to me. > > 5) Some people want author names to be a single field, which is simpler > for some things, and avoids hassles with authors with multiple names, > and other people want them to be a name and surname field, which does > mean you can do things like order by surname or order by first name. And > avoids the problem of you sometimes entering the name surname first and > sometimes first name first. > > With the name split into parts it is easier to check if the author > exists by both name and surname when entering new authors, so even if > you mistype one of them the author will still be found if they are > already in the database, but this can still be done if the name is all > in one field, it is just harder. You can also still sort the author > list by surname and by first name, it is just harder if both are in one > field. > > The choice is up to you. > > 6) I would make the ReadStatus slightly larger, say text(30). > If you use varchar there is no wasted space if some of the entries have > less text, so you could even make this varchar(100) or something. Or > give it a short code and a longer description. Something like "could not > finish" is already too long for 15 characters. Sooner or later you're > going to want to add a Status that is longer than the 15... > > 7) The Status table is incorrect. The way you have it you could have > multiple statuses for a book, and each one would be a hand typed status > description. One time you might type "reading", and another time > you might mistype "raeding". Instead, you should change the BookID to a > StatusID in the Status table, and add a StatusID to the Book table. The > PageNum field would also need to move to the Book table. > > This way you would put all the possible statuses into the Status table, > and each book would have a link to that table, showing which of those > possible statuses the book currently is in. > > Uh... > > I think that's all for now. :) > > Otherwise it looks good. You're almost there. > > > Paul > > > > On Sun, 10 Aug 2014 18:48:21 -0700 > Joel Madero <[email protected]> wrote: > > > I decided to do the smart thing and diagram it out - the diagram > > isn't perfect (looks wise) but I hope that I can get a +1 before > > actually making the databases. > > > > Note: I decided to put rankings with date read table, this way I can > > have different rankings for the same book (for each instance that I > > read it, some books seem better - or worse - the second time > > around) :) > > > > Thoughts appreciated, a million thanks for all the advice given so > > far. > > > > Link: > > > https://drive.google.com/file/d/0B2kdRhc960qdZzJZenR1Qno2LWM/edit?usp=sharing > > > > > > Best, > > Joel > > > > > > > > On 08/10/2014 01:35 PM, Paul wrote: > > > > > > > > > On Sun, 10 Aug 2014 21:30:18 +0100 > > > Mark Bourne <[email protected]> wrote: > > > > > >> It looks like each book should only have one of the "ReadStatus" > > >> flags set, so I'd make that an enum field on the BookInformation > > >> table, with possible values of "Not Read", "Reading" and "Read". > > >> You can set the default value for the field to "Not Read" so a new > > >> record will be set to that status if no value is specified for > > >> that field. > > > That's essentially the same idea as the status tables I was > > > suggesting. In my experience we've always used status tables, so I > > > would suggest those. Partly because I'm not familiar with database > > > support for enums. How well supported and widely supported is that > > > by the most common databases? > > > > > >> "NumberOfBooksByAuthor" and "ReadBooksByAuthor" are not needed on > > >> the "AuthorInfo" table - you can get those by querying the > > >> database. I may have the syntax slightly wrong here, but along the > > >> lines of: SELECT `ai`.`AuthorID`, `ai`.`AuthorName`, > > >> COUNT(`bi`.`BookID`) from `AuthorInfo` `ai` LEFT JOIN > > >> `BookInformation` `bi` ON `bi`.`AuthorID` = `ai`.`AuthorID` GROUP > > >> BY `ai`.`AuthorID` should give the number of books by each author. > > >> Add: AND `bi`.`ReadStatus` = "Read" > > >> to the ON condition and you can get the number of read books by > > >> each author. > > >> > > >> That's the kind of thing a database enables you to do much more > > >> easily than with a spreadsheet ;o) > > >> > > >> Mark. > > >> > > >> > > >> Joel Madero wrote: > > >>> Hi All, > > >>> > > >>> So I went back to planning stage. Link to what I think might work > > >>> - hoping to get the planning stage done today so I can start > > >>> actually putting together the db - I have 3 days off so now's a > > >>> good time for me to get the basic structure together :) Thanks in > > >>> advance! > > >>> > > >>> > https://drive.google.com/file/d/0B2kdRhc960qdbGJIQ1M3NWtrdmc/edit?usp=sharing > > >>> > > >>> > > >>> Best, > > >>> Joel > > > > > > > > > > -- > To unsubscribe e-mail to: [email protected] > Problems? > http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ > Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette > List archive: http://listarchives.libreoffice.org/global/users/ > All messages sent to this list will be publicly archived and cannot be > deleted > > -- To unsubscribe e-mail to: [email protected] Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
